Conclusions vs Decisions*


John W. Tukey


Princeton University and 
Bell Telephone Laboratories, Murray Hill 


With the exception of appendices 2 and 3, the following is based on the after 
dinner talk given by Professor John W. Tukey at the first meeting of the Section of 
the Physical and Engineering Sciences of the American Statistical Association held 
in New York City on May 26, 1955. This talk was repeated at a later date before a 
dinner meeting of the Metropolitan Section of the American Society for Quality 
Control. On both occasions considerable discussion ensued. The talk is published 
here both for the record, and in the hope that some readers may be stimulated to 
prepare written rejoinders. 


INTRODUCTION 


My subject tonight should be both interesting and professionally relevant, 
and yet should not involve formulas or a blackboard. Of the topics most pro- 
fessionally relevant to statisticians, I must choose between human relations, as 
between statistician and client, and statistical philosophy, both subjects where 
our practices often outshine our formal philosophy, both subjects where more 
discussion and better understanding are needed if our practices are to improve 
as fast as they should. 

It is especially important that our discussion and understanding of statistical 
philosophy be firm and well-balanced. For one-sided development, no matter 
how important the single aspect may be, will ultimately deflect some, if not all, 
of our practices into unwise bypaths. 

I have been concerned for a number of years with the tendency of decision 
theory to attempt the conquest of all statistics. This concern has been founded, 
in large part, upon my belief that science does not live by decisions alone—that 
its main support is a different sort of inference. 

Effective discussion of this problem, and a real start toward the development
of a consensus of opinion, has been retarded by the absence of a word for this 
other sort of inference, a word which could be contrasted with “decisions”. For 
me, there is now a word. (Some dislike it, but no one has suggested a better 
choice.) The word is “conclusions”. Conclusion theory is intended, not to replace 
decision theory, but to stand firm beside it. 

Because I believe that conclusions are even more important to science than 
decisions, it is particularly appropriate that I am able to speak to the first 
meeting of the ASA’s new Section on Physical and Engineering Sciences about 
the relations, and the differences, between decisions and conclusions. I know of 
no better way to wish the Section well than to encourage its membership to 


* Prepared in part in connection with research sponsored by the Office of Naval Research. 
thought and discussion on a topic which I believe will remain important to the
carrying out of the functions of all of its members. 


DECISIONS, WHAT ARE THEY?


Some of us have read about decision theory, most of us have heard of it, and 
all of us make decisions. But do we have a clear idea of what a decision-theorist’s 
decision is? Have the books made the essential situation clear? Or have they 
discussed only the externals of a single formulation? In fact, there has been so 
little discussion of essentials that I have had to formulate my own idea of what 
a “decision”, in the sense of modern decision theory, really is.

The decisions of practice are far more nearly of the form “let us decide to act
for the present as if” than of the form long conventional in treatments of decision
theory, “we accept”. The distinction is important and too often neglected.
The restrictions “act ... as if” and “for the present” convey two separate and
important ideas, ideas which serve to distinguish conclusions from decisions, 
ideas which epitomize much of what I wish to say. 

When an engineer must choose at once between two ways of building a bridge, 
or a doctor must choose which of two treatments to apply to a patient who is 
critically ill, or when a businessman must choose between two policies for the 
season that is now upon him, each must weigh alternative A against alternative 
B in this immediate situation, and strive to select the alternative that will yield 
the bigger reward, whether this reward be a cheaper safe bridge, a better chance 
of recovery for the patient, or a more profitable season. The possible actions are 
defined, their consequences in various “states of nature” are understood, and
some evidence about these states of nature is at hand. In each instance the
individual must judge whether to act as if the reward from alternative A will
indeed prove to be greater than that from alternative B (which we may abbreviate
“A > B”), or whether the opposite is true (“A < B”).

The three alternative decisions: 


(1) to act in the present situation as if A > B, 
(2) to act in the present situation as if A = B, 
(3) to act in the present situation as if A < B 


seem to me reasonably stated, while the conventional statements of the alternatives:

(1’) to accept A > B, 

(2’) to accept A = B, 

(3’) to accept A < B 


seem to have been (unconsciously) well calculated to mislead the reader or 
student. 

When we say “act as if A > B”, we have made no judgment as to the “truth”
or “certainty beyond a reasonable doubt” of the statement “A > B”. When
we say “for the present”, we are referring only to the particular situation under 
consideration at present. Thus what we have done is to weigh both the evidence 
concerning the relative merits of A and B and also the probable consequences in 
the present situation of various actions (actions, not decisions!). Finally, we have 
decided that the particular course of action which would be appropriate if A 
were truly > B is the most reasonable one to adopt in the specific situation that
faces us. 

When we say “act as if A > B” and “in the present situation”, we assert no 
judgment as to the “truth” or “certainty beyond a reasonable doubt” of the 
statement “A > B”, and we make no judgment about the wisdom of choosing
among actions in all, or even many, of the situations in which a knowledge that 
A was truly > B would determine a wise man’s choice. The consequences in 
other situations of acting as if A > B have not been considered. It is important 
that we have not done these things; it is perhaps even more important that we 
know that we have not done them. 

What has been done is simple and specific. The evidence concerning the relative 
rewards from the alternatives has been weighed. The consequences in the present
situation of various actions (not decisions!) have been assessed. We have decided 
that, in this single specific situation, the particular action that would be appropri- 
ate if A were truly > B is the most reasonable action to take. 
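
The ingredients just listed can be put into a small sketch. Everything in it is an assumption made for illustration: the two states of nature, the weights attached to them, the reward figures, and the rule of choosing the larger weighted reward (one common formalization, not one the talk prescribes).

```python
# Hypothetical illustration: every name and number here is an assumption.
# Rewards of each action under each "state of nature".
REWARDS = {
    "A": {"soft_ground": 4.0, "firm_ground": 10.0},
    "B": {"soft_ground": 7.0, "firm_ground": 8.0},
}
# Weights summarizing the evidence at hand about the states of nature.
WEIGHTS = {"soft_ground": 0.25, "firm_ground": 0.75}


def expected_reward(action):
    """Weighted reward of one action in the present situation only."""
    return sum(WEIGHTS[state] * reward
               for state, reward in REWARDS[action].items())


def decide(tolerance=1e-9):
    """Return one of the three decisions (1), (2), (3) of the text."""
    a, b = expected_reward("A"), expected_reward("B")
    if abs(a - b) <= tolerance:
        return "act in the present situation as if A = B"
    if a > b:
        return "act in the present situation as if A > B"
    return "act in the present situation as if A < B"


print(expected_reward("A"), expected_reward("B"))  # 8.5 7.75
print(decide())
```

Nothing in the output asserts that A > B is true. Change the weights or the rewards, as a different situation would, and the same evidence about A and B can lead to the opposite decision.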

Two sorts of special cases may help to tie down these remarks: It is often 
necessary to make a decision on the basis of no formal data at all. (Consider the
hen crossing the road!) It may be reasonable to make two opposite decisions at 
the same time with regard to different actions. (How many of us both save for 
our own future and carry life insurance, perhaps even in a single policy? One is 
a decision to act as if we will live, the other a decision to act as if we will die!) 

Decisions to “act for the present as if” are attempts to do as well as possible 
in specific situations, to choose wisely among the available gambles. 


CONCLUSIONS, WHAT MAY THEY BE?


Like any other human endeavor, science involves many decisions, but it 
progresses by the building up of a fairly well established body of knowledge. 
(One whose relevance is supposed to be broad.) This body grows by the reaching 
of conclusions—by acts whose essential characteristics differ widely from the 
making of decisions. Conclusions are established with careful regard to evidence, 
but without regard to consequences of specific actions in specific circumstances. 
(They are, of course, based on specific experiments or observations.) Conclusions 
are withheld until adequate evidence has accumulated. 

A conclusion is a statement which is to be accepted as applicable to the 
conditions of an experiment or observation unless and until unusually strong 
evidence to the contrary arises. This definition has three crucial parts: two
explicit, and the third implicit. It emphasizes “acceptance”, in the original,
strong sense of that word; it speaks of “unusually strong evidence”; and it
implies the possibility of later rejection. 

First, the conclusion is to be accepted. It is taken into the body of knowledge, 
not just into the guidebook of advice for immediate action, as would be the case 
with a decision. It is something of lasting value extracted from the data. 

Indeed, the conclusion is to remain accepted, unless and until unusually 
strong evidence to the contrary arises. This implies that only a small percentage 
of all conclusions will, in due course, be upset. 

Third, a conclusion is accepted subject to future rejection, when and if the 
evidence against it becomes strong enough. (Only a small proportion of conclusions
will be rejected.) It is taken to be of lasting value, but not necessarily
of everlasting value. 

These characteristics are very different from those of a decision-theorist’s 
decision. The differences are extremely important. 

It has been wisely said that “science is the use of alternative working hypotheses”.
Wise scientists use great care and skill in selecting the bundle of alternative
working hypotheses they use. Conclusions typically reduce the spread of
the bundle of those working hypotheses which are regarded as still consistent 
with the observations. Hence conclusions must be reached cautiously, firmly, 
not too soon and not too late. And they must be judged by their long run effects, 
by their “truth”, not by specific consequences of specific actions. 


STATISTICAL VS. EXPERIMENTER’S CONCLUSIONS 

As statisticians we must insist upon more than one kind of conclusion, upon 
the difference between “statistical conclusions” and “experimenter’s conclusions”.
A “statistical conclusion” applies to the actual conditions of the
experiment. If a consistent blunder were made, if the instruments or measure- 
ments yield substantial systematic errors (they will always have some syste- 
matic errors, though we may hope that these are small), if the measurements 
were reduced according to a theory which is incomplete in some important way 
(it will always be incomplete to a certain extent), if the conditions or measure- 
ments were incorrectly recorded, if the importance of important variables were 
not recognized (so that their values were not recorded or reported), the stated 
conclusions are likely to be wrong. Errors for such reasons are not to be charged 
against statistical conclusions. 

But experimenter’s conclusions, be they physical conclusions, chemical con- 
clusions, biological conclusions or engineering conclusions, must take account 
of all these possibilities. In most areas of experiment or observation it will be 
either desirable or necessary for the experimenter to make specific allowance, 
beyond the statistically recognizable uncertainty, for such deviations of the 
actual situation from the supposed situation. For this reason, his conclusions 
will be weaker than the statistical ones. 

This difference, which arises from what may loosely be called the problem of 
systematic error, is an important challenge to the statistician. Both the statis- 
tician’s morale and his integrity are tested when, for example, he has to face the 
possibility of a really substantial systematic error just after he has used all his 
skill to reduce, in the same experiment, the effects of fluctuating errors to 95% 
of their former value. It challenges his relationship to his clients in two opposite 
ways. When his client is quantitatively sophisticated, as many physical and 
engineering scientists are, he must face the systematic errors or lose his client’s 
respect. When his client is not quantitatively sophisticated, as is often the case 
in other fields, he must educate the client at the proper rate, not too rapidly and
not too slowly—first, perhaps, about fluctuating errors, but eventually about 
systematic errors, too! 


ASYMMETRY CAN BE ESSENTIAL


We have emphasized the most important differences between decisions and 
conclusions. There is another difference which is not quite among the most 
important, but which yet deserves a place of its own. This is the treatment of
doing nothing. 

In most accounts of decision theory, the decision to do nothing is either 
ignored (which is probably the worst thing to do in practice) or treated on a par 
with all the other decisions. In conclusion theory, on the other hand, not coming 
to a conclusion plays a very special role. Three instances may help us to reflect 
on this distinction: 


(1) All of us who were originally brought up in physical or biological science 
feel quite clearly, I am sure, that “to be not yet certain” is very different
from other attitudes about a question. 

(2) We may be surprised to find a related attitude among administrators— 
Chester I. Barnard, on page 194 of “The Functions of the Executive”, says
(his italics) “The fine art of executive decision consists in not deciding 
questions that are now not pertinent, in not deciding prematurely, in not 
making decisions that cannot be made effective, and in not making decisions 
that others should make.” 

(3) An active worker in decision theory told me recently that the decision to 
do nothing was “the only decision without a loss function”.


Each of these emphasizes, in a different way, the distinctive character of “doing 
nothing”. Each deserves further examination. (Appendix 2 will treat (1).) 

Barnard’s statement implies that the “decisions” of the executive are much 
more nearly what we have called conclusions than what we have called decisions. 
They are not to be entered upon lightly, and there is a clear implication that, 
once reached, they are to be referred to for some time as part of a growing body 
of doctrine. 

The decision-theorist’s statement in (3) reveals him, it seems to me, as one 
who is really in search of conclusions. Why else is “doing nothing” so different? 
It is an action, one that can, in particular, lose money. 

Decision theory ought to be symmetrical with regard to the action “do
nothing”. Conclusion theory must be unsymmetrical with regard to the action
“conclude nothing”.


TESTS OF SIGNIFICANCE


The prototype of modern experimental statistics was the test of significant 
difference. It came first as a tool of analysis and inference, not as a tool of mere 
description. When we examine its purport in the framework we are describing, 
we find that it is a qualitative conclusion procedure. Its purpose is to answer the 
question “Dare we conclude that this difference is not zero?”.

We may, on the basis of a test of significance, conclude that A ≠ B, or even
more specifically that A < B or A > B. But failure to attain significance is not, 
of itself, intended to produce a conclusion, is not intended to be accepted, in 
that strong sense of the word “accept” which is relevant to conclusions. 

Where do we stand when the difference between A and B has not reached 
“significance”? Some would like to wield Occam’s razor and say that “We have
shown that A = B”. Surely we have not concluded that A = B. For no quantitative
evidence can establish that A is not just a very little different from B.
Perhaps we have decided that A = B, but if so, for what specific situation, on
what evidence, and with what assessment of consequences? 

To interpret appropriately a failure to attain significance, it is necessary to 
know something about the precision of the comparison, to know how close 
there is reason to believe A is to B. Only by advancing into the use of confidence 
techniques (about which more anon) can a negative statement about significance 
be converted into a positive conclusion, a conclusion of established smallness of
difference. 
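
The step from a significance test to a statement of established smallness can be sketched as follows. This is only an illustration under stated assumptions: the interval uses a normal approximation (via `statistics.NormalDist` from the Python standard library), and the threshold `small`, below which a difference is regarded as negligible, is a hypothetical subject-matter choice that the talk does not specify.

```python
from statistics import NormalDist


def interval_for_difference(diff, std_error, confidence=0.95):
    """Normal-approximation confidence interval for the difference A - B."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return diff - z * std_error, diff + z * std_error


def conclude(diff, std_error, small=1.0, confidence=0.95):
    """Turn an interval into a conclusion, or withhold one.

    `small` is an assumed bound on what counts as a negligible difference.
    """
    low, high = interval_for_difference(diff, std_error, confidence)
    if low > 0:
        return "conclude A > B"
    if high < 0:
        return "conclude A < B"
    if -small < low and high < small:
        return "conclude the difference is small"
    return "no conclusion yet"


# Same observed difference, not significant in either case:
print(conclude(0.2, 0.3))  # precise comparison
print(conclude(0.2, 3.0))  # imprecise comparison
```

Both calls fail to attain significance, yet only the precise comparison supports a positive conclusion. The imprecise one leaves the question open, which is the asymmetry the text insists upon.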


TESTS OF HYPOTHESIS


Symmetry, and mathematical simplicity, seemed to lead along a straight 
path from tests of significance to tests of hypothesis. As the procession traversed 
this path, few if any stopped to see where they had gone—to notice that they 
had left a qualitative conclusion procedure and had come to what was sus- 
piciously like a qualitative decision procedure. 

The choice between two simple hypotheses can be viewed in two quite different 
ways: 


(1) as an attempt to choose the best risk, without regard to certainty— 
which is surely a decision procedure, or 

(2) as an attempt to control, often by a sequential procedure, both kinds of 
error (both the error of accepting the hypothesis when it is false, and the 
error of rejecting it when it is true) at suitably low levels—which is, on 
the face of it, a conclusion procedure. 


The aim of (2) can be expressed as follows: “We will take enough observations
to allow us to dare to conclude either that the first hypothesis is false, or that the 
second hypothesis is false, but we shall not try to conclude that both are false, 
even if the observations prove adequate to do this.” The form of this statement 
is clearly that of a conclusion procedure, though it is natural to wonder at the 
presence of its last proviso. 
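
One well-known sequential procedure of this kind is Wald's sequential probability ratio test. The sketch below applies it to yes/no observations. The talk does not name a specific procedure, so the choice of this test, the two hypothesized success rates `p0` and `p1`, and the error levels `alpha` and `beta` are all assumptions made for illustration; the stopping bounds are Wald's usual approximations.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Sequential test of a first hypothesis (rate p0) against a second (rate p1).

    Returns (statement, number of observations used).
    """
    upper = math.log((1 - beta) / alpha)  # reaching it: first hypothesis rejected
    lower = math.log(beta / (1 - alpha))  # reaching it: second hypothesis rejected
    log_ratio = 0.0  # log likelihood ratio of the second hypothesis to the first
    used = 0
    for x in observations:
        used += 1
        if x:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1 - p1) / (1 - p0))
        if log_ratio >= upper:
            return "conclude the first hypothesis is false", used
        if log_ratio <= lower:
            return "conclude the second hypothesis is false", used
    return "no conclusion yet", used


print(sprt_bernoulli([1] * 20, p0=0.5, p1=0.75))
print(sprt_bernoulli([0] * 20, p0=0.5, p1=0.75))
print(sprt_bernoulli([1, 0, 1], p0=0.5, p1=0.75))
```

The procedure can end in only three ways: one hypothesis rejected, the other rejected, or nothing concluded yet. It never reports that both are false, which is the proviso remarked upon in the text.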

If, on the other hand, the aim is really (1), to choose the best risk, then there 
is no real place in the procedure for the artificial limitations of 5%, of 1%, or 
of any of the conventional significance or confidence levels. If nothing is to be 
concluded, only something decided, there is no need to control the probability 
of error. (Only the mathematical expectation of gain needs to be positive to 
make a small gamble profitable. There is no need for high confidence in winning 
individual bets. A coin which comes heads 60% of the time will win more money 
safely than one that comes heads 95% or 99% of the time.) 
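
The arithmetic behind the remark on expectation can be checked directly. The sketch assumes even-money bets of one unit each on a coin that wins with a fixed probability; those betting terms are an assumption for illustration and are not spelled out in the talk.

```python
from math import comb


def expected_gain(p_win, n_bets, stake=1.0):
    """Expected total gain from n_bets even-money bets of size `stake`."""
    return n_bets * stake * (2 * p_win - 1)


def prob_net_gain(p_win, n_bets):
    """Exact probability that wins outnumber losses after n_bets bets."""
    need = n_bets // 2 + 1  # smallest number of wins giving a net gain
    return sum(comb(n_bets, k) * p_win**k * (1 - p_win)**(n_bets - k)
               for k in range(need, n_bets + 1))


for n in (1, 10, 100, 1000):
    print(n, round(expected_gain(0.6, n), 6), prob_net_gain(0.6, n))
```

A single bet on the 60% coin is won only 60% of the time, far short of any conventional confidence level. Yet the expected gain is positive on every bet, and over a long run of bets a net gain becomes nearly certain.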

Until we go through the accounts of the testing of hypotheses, separating 
decision elements from conclusion elements, the intimate mixture of disparate 
elements will be a continual source of confusion. The writer looks forward to
the day when the history and status of tests of hypotheses will have been dis- 
entangled. (See also Appendix 3.) 


ESTIMATION 


Older by far than any other statistical techniques are point estimates, each 
a simple indication of the value the data seem to point out or suggest. They are 
quantitative, where the other classical procedures we have so far discussed are
qualitative. They are attempts to do the best that we can, not to do only what 
we can be certain about. Hence they are decision procedures, more specifically, 
quantitative decision procedures. So far as our classification goes, they offer no 
problems. 

Probably the greatest ultimate importance, among all types of statistical 
procedures we now know, belongs to confidence procedures which, by making 
interval estimates, attempt to reach as strong conclusions as are reasonable by 
pointing out, not single likely values, but rather whole classes (intervals, regions, 
etc.) of possible values, so chosen that there can be high confidence that the 
“true” value is somewhere among them. Such procedures are clearly quantitative 
conclusion procedures. They make clear the essential “smudginess” of experimental
knowledge.


THE TWIN DICHOTOMIES


Keeping the varied sorts of statistical inference procedures separate, and 
yet properly related to one another, is important to every statistician. Hopefully, 
the distinction between decisions and conclusions, as well as the distinction 
between qualitative and quantitative, are now clear. 

The writer has found, and continues to find, these twin dichotomies (qualita- 
tive-quantitative and conclusion-decision) most helpful in organizing the 
procedures of statistics into a pattern which is useful both for application and 
reflection. 

Surely the quantitative is preferable to the qualitative whenever both are 
equally available and equally relevant. Thus most qualitative statistical pro- 
cedures are interim measures, introduced to serve until equally relevant quantita- 
tive procedures become available. 

If we use the phrases “to do one’s best” and “to state only that which is
certain” as typifying decisions on the one hand, and conclusions on the other,
we can see that there is a real place for both. And in particular situations we 
can usually tell what these places are. 

To sum things up: the case of qualitative vs. quantitative should have a mixed 
verdict, granting “qualitative” squatter’s rights, but only until “quantitative”
is ready to move in; while the case of conclusion vs. decisions should be settled 
out of court, with an understanding that cooperation is vital to both parties. 
There is a place for both “doing one’s best” and “saying only what is certain”,
but it is important to know, in each instance, both which one is being done, and 
which one ought to be done. 


APPENDIX 1 


SOME CONFUSING RELATIONS


So long as we lacked clearly contrasted words, confusion between decisions 
and conclusions was very easy, especially since they are so thoroughly combined, 
both so frequently, and in almost every possible way. 

Both decisions and conclusions are required in almost every field of human 
endeavor, yet the proportions, mutual relations and relative dominance which 
are appropriate vary greatly from one field to another. The aim and purpose of
pure science lies in the conclusions which build up knowledge. Yet these con- 
clusions are reached because individual scientists decide to attack certain 
problems in certain ways. (They rarely, if ever, know enough to conclude 
which problems they should attack, or how.) In most fields of engineering much 
must depend on the wisdom of experience, on engineering judgment, on engineer- 
ing decisions. Yet these decisions are built upon the conclusions of pure and 
applied science. Engineering uses decisions fortified with conclusions, just as 
science uses decisions to reach conclusions. 

In statistics, too, conclusions and decisions are interrelated and intertwined. 
It is not infrequent that we come to conclusions about decision procedures. What 
may prove to be one of the greatest monuments to Abraham Wald’s memory 
is the notion of admissibility. And one of its more important elements is the fact 
that we may conclude (in this instance purely from theory and presuppositions) 
that one decision procedure is always worse than another. 

We have seen that point estimates may reasonably be regarded as decisions. 
If we have a situation in which alternative point estimates are investigated by 
experimental sampling, and if the sampling is continued until the effects of 
sampling fluctuations fall below a prechosen standard of smallness, we are really 
experimenting until we can reach a conclusion about competing decision pro- 
cedures. A third instance, closer in feeling to the first, is provided by R. A. 
Fisher’s classical paper of 1920, “A Mathematical Examination of the Methods of
Determining the Accuracy of an Observation by the Mean Error, and by the
Mean Square Error”, one of many objective comparisons of estimators.
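
A comparison of two point estimates by experimental sampling can be sketched as below, using the two quantities named in Fisher's title as estimates of the standard deviation of normal data. The sample size, number of replications, seed, and the use of mean squared error as the yardstick are assumptions for illustration. The sketch also simplifies the scheme described above by running a fixed number of replications rather than continuing until the sampling fluctuation falls below a prechosen standard.

```python
import math
import random


def compare_sigma_estimators(n=20, replications=10000, sigma=1.0, seed=0):
    """Mean squared error of two estimates of sigma from normal samples.

    Returns (mse_mean_deviation, mse_root_mean_square).
    """
    rng = random.Random(seed)
    # Scaling that turns the mean absolute deviation into an estimate of
    # sigma for normal data (exact only in the large-sample limit).
    scale = math.sqrt(math.pi / 2.0)
    err_md = err_sd = 0.0
    for _ in range(replications):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        md_est = scale * sum(abs(x - mean) for x in sample) / n
        sd_est = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        err_md += (md_est - sigma) ** 2
        err_sd += (sd_est - sigma) ** 2
    return err_md / replications, err_sd / replications


mse_md, mse_sd = compare_sigma_estimators()
print("mean deviation estimate:   ", mse_md)
print("root mean square estimate: ", mse_sd)
```

Each estimate taken alone is a decision about the value of sigma. The statement that one of them has the smaller error for normal samples, once the sampling is extensive enough, is a conclusion about the two decision procedures.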

On the other hand, all of us make decisions about conclusion procedures. 
Some of us do it every day. “How is it best to analyze this data?” is a question 
which cannot be left to the experimenter alone, which the statistician is bound 
by his profession to try to answer. If the answer should clearly be a procedure 
to provide a conclusion, then he must do something about a conclusion procedure. 
Does he decide about it, or conclude about it? As an inherent conservative 
(professionally, anyway!), he would like to conclude. But will he have enough 
firm evidence? Often he will not! 

When a transformation is chosen, whether for an analysis of variance, for a 
quantal response assay, or for some other statistical procedure, how often does 
the chooser know what is the best transformation? In a theoretical sense, the 
answer is “never”, for he will have only a finite amount of information (since
his estimate of “best” will have a finite standard deviation) and transformations
can be varied in arbitrarily small steps. In practice, it must be recognized that 
exactly the best transformation is not required, so that such an argument is not 
compelling. Yet, even in a practical sense, the answer is “not nearly often
enough”, for adequate information is often, or even usually, lacking. Who 
knows of an instance, to take a concrete example, where the choice between 
probits and logits for a quantal response assay was a conclusion and not a 
decision? 

In handling complex data by analysis of variance, how shall we set up the 
analysis? How detailed shall be our computations? On what orthogonal functions 
shall we calculate regressions? Can any of you recall situations where the answer 
to any of these was a conclusion? 


APPENDIX 2


CONCLUSION THEORY AS AN ACTION SYSTEM


Insofar as man’s organized activities can be regarded as striving toward at 
least dimly recognized goals, it is easy to argue that the individual actions 
which make up these activities should be guided by some appropriate form of 
decision theory. Actions are to be taken in specific instances, and the gains or 
losses resulting from specific combinations of actions and states of nature can, 
in principle, be at least roughly assessed. Why then should there be a place for 
conclusion theory, which seems from such a broad viewpoint to be a poor substi- 
tute for what is really needed? At least four classes of important reasons loom 
up over the horizon: problems of communication, problems of assessment of 
gain and loss, problems of assessment of the a priori, problems of adequate 
mathematical treatment. 

Most human affairs are not conducted by a single individual, nor even by a 
single executive hierarchy. Science, in the broadest sense, is both one of the most 
successful of human affairs, and one of the most decentralized. In principle, 
each of us puts his evidence (his observations, experimental or not, and their 
discussion) before all the others, and in due course an adequate consensus of 
opinion develops. In the early decades of the Royal Society of London, this was 
indeed very nearly how things were. But the number of working scientists has 
doubled, and redoubled very many times since then. As a consequence, problems 
of communication have probably come to dominate the problems of scientific 
method. And the practices of science have developed to meet the challenge. 
Outstanding among these practices is the use of conclusions. A scientist is 
helped little to know that another, given different evidence and facing a different 
specific situation, decided (even decided wisely) to act as if so-and-so were the 
true state of nature. The communication (for information, not as directives) of 
decisions is often inappropriate, and usually inefficient. A scientist is helped 
much to know that another reached a certain conclusion, that he felt that the 
correctness of so-and-so was established with high confidence. In order to 
replace conclusions as the basic means of communication, it would be necessary 
to rearrange and replan the entire fabric of science. No statistician should dare 
to attempt such a task on the basis of his limited area of specialized knowledge. 

But suppose a new fabric of science were to be developed. How could the old 
be compared with the new? Let us admit for simplicity that rapidity of progress 
is what is desired. (To do this for argument’s sake alone does not mean that the 
intellectual and artistic aspects of science are being neglected in comparison 
with its pragmatic aspects.) Can one judge now how far science will progress 
(using the old fabric) in twenty years? And if not, how could we judge whether 
twenty years’ use of a new fabric had done better or worse? If twenty years of 
trial would not be adequate for evaluation, how can an advance assessment 
give any useful idea of the gains and losses to be expected from a change to a 
new fabric? 

If there were to be a change to a new fabric of science, one based more explicitly
on decision-theoretic principles, how would the choice among many
such fabrics be made? There would be a need to choose something like an a
priori state of the whole world, more precisely to choose an a priori distribution
of probability over all the possible states of the whole world, since just as the
admissible decision procedures are the Bayesian solutions, (those solutions 
which are optimum for suitable assumptions about the a priori probabilities of 
all “states of nature” considered) so too the admissible decision fabrics are to be 
expected to be Bayesian fabrics. And it is a little too much to ask of those who 
have learned to study certain limited aspects of the world, and who are striving 
to learn a little more of these aspects, that they envisage all possible worlds and 
distribute probability among them. 

Finally, there are problems of adequate mathematical treatment. Statistics 
can solve a few vastly over-simplified problems in great generality or great detail, 
but it has barely begun to chew out a few little entrances into many problems of 
moderate difficulty. Problems of the order of difficulty of finding a Bayesian 
fabric, given the gains and losses, are wholly outside its present grasp. Today 
it has not provided even a beginning of an answer for such vastly simpler problems 
as: given samples of moderate size from each of two populations, given that the 
populations are so nearly normal (i.e. Gaussian) that samples of 1000 have no 
more than an even chance of detecting (at 5% significance) that the populations 
are not normal, and (even) given that the populations are symmetrical, what 
is the safest way to compare the centers of the two populations on the basis of 
the samples, where safety combines (1) reasonable reliability of significance or 
confidence percentages and (2) avoidance of procedures which are relatively 
very wasteful for particular population shapes. (Notes: (a) Even these many 
words, of course, have not completely specified a problem. (b) Adding a prob- 
ability distribution over shapes to the hypotheses seems unlikely to make the 
problem easier.) 

There are four types of difficulty, then, ranging from communication through 
assessment to mathematical treatment, each of which by itself will be sufficient, 
for a long time, to prevent the replacement, in science, of the system of con- 
clusions by a system based more closely on today’s decision theory. Once these 
four have been examined, the natural question becomes: “How did the conclusion 
system escape the parallel sets of difficulties?”. The answer is simple and clear. 
It grew. This means that it evolved; that many minor alternatives were tried, 
often unconsciously, that most were found wanting, and were discarded; that 
this process of trial and selection went through cycle after cycle. The strength 
of the process of science today comes from experience rather than insight, and 
this state of affairs may be expected to continue for a long time. Indeed, it will 
not be easy to gain the limited insight required to understand how the present 
processes of science do as well as they do. 


APPENDIX 3 


WHAT OF TESTS OF HYPOTHESIS?


In view of Neyman’s continued insistence on “inductive behavior”, words
which relate more naturally to decisions than to conclusions, it is reasonable to 
suppose that the Neyman-Pearson theory of testing hypotheses was, at the very 
least, a long step in the direction of decision theory, and that the appearance 
of 5%, 1% and the like in its development and discussion was a carryover from
the then dominant qualitative conclusion theory, the theory of tests of
significance. If this view is correct, Wald’s decision theory now does much more nearly
what tests of hypothesis were intended to do. Indeed, there are three ways in 
which it does better. First, it has given up a fixed probability for errors of the 
first kind, and has focussed on gains, losses or regrets (be they average or 
minimax). Secondly, it has made it somewhat easier to consider a much wider 
variety of specifications, to make much less stringent assumptions. And finally 
it has shown that one should expect mathematics to provide, not a single best 
procedure, but rather an assortment of good procedures (e.g. a complete class of 
admissible procedures) from which judgment and insight into a particular 
instance (perhaps expressed in the form of an a priori distribution) must be 
used to select the “best” procedure. 

If one aspect of the theory of testing hypotheses has been embodied in modern 
decision theory, what of its other aspects? The notion of the power function of a 
test, which is of course strictly analogous to the notion of the operating character- 
istic of a sampling plan, is just as applicable to tests of significance (conclusions) 
as to tests of hypotheses (decisions). And, indeed, its natural generalization to 
confidence procedures (conclusions) seems more natural and reasonable than 
such conventional criteria as the average length of confidence intervals. 

Conclusion theory can take over these nondecisional aspects of the theory of 
testing hypotheses. Its main concern in so doing must be caution about over- 
narrow specifications. To know that a certain confidence procedure is optimum, 
so long as the underlying observations follow a normal (i.e. Gaussian) distribution
precisely, is not enough, if the procedure is poor for distributions whose
shapes are very difficult to distinguish (in practice) from normality. (And such
situations exist, at least in large samples, cp. (3).)

In the long run, then, the theory of testing hypotheses can be absorbed into 
the contrasted bodies of decision theory and conclusion theory. (And the Neyman- 
Pearson lemma can serve, in its proper place, in both.)

This paper describes statistical methods for testing hypotheses about the mean
of an exponential distribution of life. Advantage is taken of the time-ordered nature 
of life test data to shorten substantially the time required to reach a decision. Re- 
placement, non-replacement, sequential, non-sequential, and truncated procedures 
are described. Some useful tables are given at the end of the paper. 


I. INTRODUCTION 


It is a characteristic feature of most life and fatigue tests that they give rise 
to ordered observations. If, for example, 20 radio tubes are placed on life
test, and $t_i$ denotes the time when the $i$-th tube fails, the data occur in such a
way that $t_1 < t_2 < \cdots < t_{20}$. Exactly the same kind of ordered situation will
occur whether the problem under consideration deals with the life of electric 
bulbs, the life of electronic components, the life of ball bearings, or the length of 
life of human beings after they are treated for a disease. The examples we have 
just given all involved ordering in time. This need not necessarily be the case. 
If we are interested in destructive test situations involving such things as the 
current needed to blow a fuse, the voltage needed to break down a condenser, 
the force needed to rupture a physical material, then we can often arrange to 
test in such a way that every item in the sample is subjected to precisely the 
same stimulus (current, voltage, stress). If this is done, then clearly the weakest 
item will be observed to fail first, the second weakest next, etc. In the present 
paper we discuss almost exclusively situations in which it is the time to failure 
that is the important random variable, and therefore we shall use the language 
of time throughout the paper. It should be emphasized, however, that there 
will be some practical problems which do not involve time, but for which some 
of the ideas discussed in this paper are quite relevant. 

Put in general terms, we test n items drawn at random from some population 
and the data become available in such a way that the smallest observation comes 
first, the second smallest, second, ..., and finally the largest observation last. 
Clearly we can, if we choose, discontinue experimentation long before all n 
items have failed. In particular we may decide to terminate the experiment as 
soon as we have the first $r$ ($\le n$) failures, or we may decide to stop at some


preassigned truncation time $T_0$, or we may adopt a sequential procedure
permitting us to stop as soon as certain conditions are met. In all of these cases our
primary concern is the development of statistical procedures which, by taking
advantage of the fact that data become available in order, will enable the
experimenter to reach a decision in a shorter time or with fewer observations
than would be possible if data did not become available in a time-ordered way.

* The preparation of this paper was supported in part by the Office of Naval Research.
A more detailed presentation is given in Technical Report No. 3, ONR Contract
Nonr-2163(00). The material will appear in unabridged form in a monograph being written by

In this paper we make the assumption that the length of life $X$ has an exponential
distribution described by the probability density function (henceforth
abbreviated as p.d.f.) $f(x; \theta)$ of the form

$$f(x; \theta) = \begin{cases} \dfrac{1}{\theta}\, e^{-x/\theta}, & x > 0,\ \theta > 0, \\ 0, & \text{elsewhere.} \end{cases} \tag{1}$$


A partial justification for this assumption has been discussed in [1] and several 
relevant references are given there. 























II. LIFE TESTS DISCONTINUED AFTER A FIXED NUMBER OF FAILURES


The following is shown in [2]: Let $n$ items be drawn at random from a distribution
whose p.d.f. is given by (1) and placed on life test. Let the observations
become available in order, i.e., $x_{1,n} \le x_{2,n} \le \cdots \le x_{r,n} \le \cdots \le x_{n,n}$, where
by $x_{i,n}$ is meant the time when the $i$-th failure occurs. Suppose that experimentation
is discontinued as soon as the $r$-th item fails ($r$ is preassigned), then
the maximum likelihood estimate of the mean life $\theta$ is given by $\hat\theta_{r,n}$, where

$$\hat\theta_{r,n} = \frac{x_{1,n} + x_{2,n} + \cdots + x_{r,n} + (n - r)\,x_{r,n}}{r}. \tag{2}$$

($\theta$ is the mean life since $E(X) = \int_0^\infty x\,\frac{1}{\theta}\,e^{-x/\theta}\,dx = \theta$.)

In words, we add up the total number of hours lived by all items, those that
failed and those that did not fail, and divide by the number of failures. The
estimate $\hat\theta_{r,n}$ is “best” in the sense that in addition to being maximum likelihood,
it is also unbiased, minimum variance, efficient, and sufficient. The p.d.f. of
$\hat\theta_{r,n}$ is given by

$$f(y) = \begin{cases} \dfrac{1}{(r-1)!}\left(\dfrac{r}{\theta}\right)^{r} y^{r-1} e^{-ry/\theta}, & y > 0, \\ 0, & \text{elsewhere,} \end{cases} \tag{3}$$

and $2r\hat\theta_{r,n}/\theta$ is distributed as chi-square with $2r$ degrees of freedom (which we
denote as $\chi^2(2r)$).

In the preceding paragraph we have been concerned with the non-replacement 
situation where one does not replace failed items at once by new items drawn 
from the underlying p.d.f. (1). In the replacement case (where one immediately 
replaces a failed item by a new one) it can be shown that the maximum likelihood
estimate of the mean life $\theta$ is given by

$$\hat\theta_{r,n} = n\,x_{r,n}/r, \tag{4}$$

where by $x_{r,n}$ is meant the time (measured from the beginning of the life test)
to observe the $r$-th failure and where the sample size $n$ is maintained throughout
the life test. It should be remarked that $n x_{r,n}$ is the total number of hours lived
by all items on test since

$$n x_{r,n} = n x_{1,n} + n(x_{2,n} - x_{1,n}) + n(x_{3,n} - x_{2,n}) + \cdots + n(x_{r,n} - x_{r-1,n}). \tag{5}$$

On the right-hand side of (5), $n x_{1,n}$ is the number of hours lived by all items up
to the time the first failure occurred, and $n(x_{i,n} - x_{i-1,n})$ is the number of hours
lived by all items between the times of occurrence of the $(i-1)$st failure and
$i$-th failure. The estimate (4) in the replacement case has precisely the same
distribution and the same optimum properties as does the estimate (2) in the
non-replacement case. In fact, if we let $T_{r,n}$ be the total number of hours lived
by all items, whether they failed or not, up to the time when the $r$-th failure
occurred, one can write both (2) and (4) as

$$\hat\theta_{r,n} = T_{r,n}/r, \tag{6}$$

where

$$T_{r,n} = x_{1,n} + x_{2,n} + \cdots + x_{r-1,n} + (n - r + 1)\,x_{r,n}$$

in the non-replacement case, and where

$$T_{r,n} = n\,x_{r,n}$$

in the replacement case. In either case, $2T_{r,n}/\theta$ is distributed as $\chi^2(2r)$.
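
The estimate (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `total_life` and `mean_life_estimate` are hypothetical, and the input is assumed to be the list of the first $r$ observed failure times, all measured from the start of the test.

```python
from typing import Sequence


def total_life(failure_times: Sequence[float], n: int, replacement: bool) -> float:
    """Accumulated life T_{r,n} up to the r-th failure, with r = len(failure_times)."""
    times = sorted(failure_times)
    r = len(times)
    if r == 0:
        raise ValueError("at least one failure time is required")
    if replacement:
        # n items are on test at all times, so the total life is n * x_{r,n}.
        return n * times[-1]
    if r > n:
        raise ValueError("cannot observe more than n failures without replacement")
    # Each failed item contributes its own lifetime; each of the n - r
    # survivors contributes x_{r,n}, the time at which testing stops.
    return sum(times) + (n - r) * times[-1]


def mean_life_estimate(failure_times: Sequence[float], n: int, replacement: bool) -> float:
    """Estimate (6): total life divided by the number of failures."""
    return total_life(failure_times, n, replacement) / len(failure_times)
```

In the replacement case only the time of the last failure matters, whereas in the non-replacement case every failure time enters the sum.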

Suppose we want to find a test procedure which will give a prescribed operating
characteristic curve (henceforth abbreviated as OC curve). Put in statistical
terms ($\theta_0$ is some acceptable (high) mean life; $\theta_1$ is some unacceptable (low)
mean life; $\alpha$ is the producer’s risk, and $\beta$ is the consumer’s risk), we want to test
the hypothesis $H_0: \theta = \theta_0$ against the alternative $H_1: \theta = \theta_1 < \theta_0$ subject to
the conditions that for $\theta = \theta_0$ and $\theta = \theta_1$, respectively,
$L(\theta_0) = \Pr(\text{accepting } \theta = \theta_0 \mid \theta_0 \text{ is true}) = 1 - \alpha$ and
$L(\theta_1) = \Pr(\text{accepting } \theta = \theta_0 \mid \theta_1 \text{ is true}) \le \beta$.
It is shown in [2] that the region of acceptance for $\theta = \theta_0$ must be of the form

$$\hat\theta_{r,n} > C = \theta_0\,\chi^2_{1-\alpha}(2r)/2r, \tag{7}$$

where $r$ is the smallest integer such that $\chi^2_{1-\alpha}(2r)/\chi^2_{\beta}(2r) \ge \theta_1/\theta_0$. The appropriate
values of $r$ (and hence $C$) for certain values of $\alpha$, $\beta$, and $\theta_0/\theta_1$ are given
in Table 1. In the preceding formulae, $\chi^2_{1-\alpha}(2r)$ is the lower $100\alpha$ per cent point,
and $\chi^2_{\beta}(2r)$ is the upper $100\beta$ per cent point of $\chi^2(2r)$.
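
The search for $r$ in (7) can be illustrated with the following sketch. It is not taken from the paper: the helper names (`chi2_sf_even`, `chi2_upper_point`, `fixed_failure_plan`) and the cap `r_max` are assumptions, and the percentage points are obtained by bisection on the closed-form tail of a chi-square variable with an even number of degrees of freedom rather than from tables.

```python
import math


def chi2_sf_even(x: float, r: int) -> float:
    """P(chi-square with 2r degrees of freedom > x), closed form for even d.f."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, r):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def chi2_upper_point(p: float, r: int) -> float:
    """Value x with P(chi-square(2r) > x) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, r) > p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, r) > p:
            lo = mid              # tail still too heavy: root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0


def fixed_failure_plan(theta0: float, theta1: float, alpha: float, beta: float,
                       r_max: int = 300):
    """Smallest r satisfying the ratio condition below (7), and the constant C."""
    for r in range(1, r_max + 1):
        lower = chi2_upper_point(1.0 - alpha, r)   # lower 100*alpha per cent point
        upper = chi2_upper_point(beta, r)          # upper 100*beta per cent point
        if lower / upper >= theta1 / theta0:
            return r, theta0 * lower / (2 * r)
    raise ValueError("no suitable r found up to r_max")
```

For $\theta_0 = 1500$, $\theta_1 = 500$ and $\alpha = \beta = .05$ the search stops at $r = 10$, in agreement with the value used in Section V.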

In the test procedure $\hat\theta_{r,n} > C$, the sample size $n$ is at our disposal. The effect
of increasing $n$ is to shorten the time needed on the average to reach a decision
and thus if we happen to be in a situation where the items being tested are
cheap but where time is very valuable, we may well prefer a test of the form
$\hat\theta_{r,n} > C$ to one which is of the form $\hat\theta_{r,r} > C$. These two procedures have exactly
the same OC curve, and our only reason for preferring a rule of action based 
on the first r failures out of n items tested to one based on failing all r out of r 
items is that the first rule will take a shorter time, on the average. Thus, for 
example, a test procedure which involves stopping an experiment after the 
first of two items on test has failed will lead to rules of action whose OC curve

TABLE 1
Values of $r$ (upper numbers) and of $\chi^2_{1-\alpha}(2r)/2$ (lower numbers) such that the test based
on using $\hat\theta_{r,n} > C = \theta_0\chi^2_{1-\alpha}(2r)/2r$ as acceptance region for $\theta = \theta_0$
will have $L(\theta_0) = 1 - \alpha$ and $L(\theta_1) \le \beta$.


                     $\alpha$ = .01                  $\alpha$ = .05                  $\alpha$ = .10
$\theta_0/\theta_1$   $\beta$=.01  $\beta$=.05  $\beta$=.10    $\beta$=.01  $\beta$=.05  $\beta$=.10    $\beta$=.01  $\beta$=.05  $\beta$=.10


3/2        136    101     83       95     67     55       77     52     41
          110.4   79.1   63.3     79.6   54.1   43.4     66.0   43.0   33.0


46 35 30 33 23 19 26 18 15 
31.7 22.7 18.7 24.2 15.7 12.4 19.7 12.8 


27 21 18 19 14 11 15 11 
16.4 11.8 9.62 12.4 8.46 6.17 10.3 7.02 


19 15 13 13 10 11 
10.3 7.48 6.10 7.69 5.43 


12 10 7 
5.43 3.51 4.70 3.29 


7 7 
2.33 3.29 1.97 


Bs 3 
823 1.37 -818 818 


is precisely the same as that found by placing one item on test and waiting 
until it fails. However, the expected length of time in the first procedure is 
only one-half that in the second procedure. Consequently, if the time saved 
outweighs the loss due to testing two items rather than one, we would prefer
the first procedure. 

Let $E(X_{r,n})$ be the expected length of time needed to observe the first $r$
failures out of $n$ items placed on test, and let $E(X_{r,r})$ be the expected length of
time needed to observe all $r$ items to fail, if $r$ items are placed on test; then the
ratio

$$a_{r,n} = E(X_{r,n})/E(X_{r,r}) \tag{8}$$


is a measure of the expected saving in time due to using the first procedure 
as compared with the second procedure. In Table 2 we give the values of this 
ratio for selected small values of r and n in the non-replacement case. This 
table shows that if “time is money,” procedures which terminate before the 


whole sample is observed may be very advantageous. In evaluating (8), the
following formulae are useful:

$$E(X_{r,n}) = \theta\left[\frac{1}{n} + \frac{1}{n-1} + \cdots + \frac{1}{n-r+1}\right] \tag{9}$$

in the case where failed items are not replaced, and

$$E(X_{r,n}) = r\theta/n \tag{10}$$

in the case where failed items are replaced at once by new items drawn from
the p.d.f. (1).

TABLE 2
Ratio of the expected waiting time to observe the $r$-th failure in samples of size $n$ and $r$,
respectively. $E(X_{r,n})/E(X_{r,r}) = a_{r,n}$

  r     n = 5

  1      .20
  2      .30
  3      .43
  4      .62
  5     1
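
The ratio (8) is easy to evaluate from (9) and (10). The sketch below is illustrative only; the function names `expected_wait` and `time_ratio` are hypothetical, and $\theta$ defaults to 1 because it cancels in the ratio.

```python
def expected_wait(r: int, n: int, theta: float = 1.0, replacement: bool = False) -> float:
    """Expected time to the r-th failure among n items on test, per (9) or (10)."""
    if r < 1 or n < 1:
        raise ValueError("r and n must be positive")
    if replacement:
        return r * theta / n                      # formula (10)
    if r > n:
        raise ValueError("r cannot exceed n without replacement")
    # Formula (9): theta * (1/n + 1/(n-1) + ... + 1/(n-r+1)).
    return theta * sum(1.0 / (n - j) for j in range(r))


def time_ratio(r: int, n: int) -> float:
    """Ratio (8) in the non-replacement case; theta cancels."""
    return expected_wait(r, n) / expected_wait(r, r)
```

For instance, `time_ratio(1, 2)` equals one-half, which is the saving quoted above for stopping after the first of two items has failed.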


III. TIME TRUNCATED LIFE TESTS


It is frequently necessary on practical grounds to terminate a life test by a
preassigned time $T_0$. This leads to truncated tests in which it is decided in
advance that the life test will be terminated at $\min(X_{r_0,n}; T_0)$, where $X_{r_0,n}$ is the
time at which the $r_0$-th failure occurs and $T_0$ is the truncation time beyond
which the life test will not be allowed to run. If the life test is terminated at
$X_{r_0,n}$ (i.e., $r_0$ failures occur before time $T_0$), then the action taken will be to
reject. If the experiment is terminated at time $T_0$ (i.e., the $r_0$-th failure occurs after
time $T_0$), then the action in terms of hypothesis testing is acceptance. In [3],
one can find details concerning such test procedures for both the replacement
and non-replacement cases. These test procedures are characterized by three
functions, $E_\theta(r)$, $E_\theta(T)$, and $L(\theta)$: the expected number of observations to reach
a decision, the expected waiting time to reach a decision, and the probability
of accepting, respectively, if $\theta$ is the true value. The formulae are given below.

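In the non-replacement case,

$$E_\theta(r) = n p_\theta \left[\sum_{k=0}^{r_0-2} b(k; n-1, p_\theta)\right] + r_0\left[1 - \sum_{k=0}^{r_0-1} b(k; n, p_\theta)\right], \tag{11}$$

where

$$p_\theta = 1 - e^{-T_0/\theta} \quad \text{and} \quad b(k; n, p_\theta) = \binom{n}{k} p_\theta^{k} (1 - p_\theta)^{n-k}.$$

The probability distribution of $r$ is given by

$$\Pr(r = k \mid \theta) = b(k; n, p_\theta), \quad k = 0, 1, 2, \ldots, r_0 - 1 \tag{12}$$

and

$$\Pr(r = r_0 \mid \theta) = 1 - \sum_{k=0}^{r_0-1} \Pr(r = k \mid \theta). \tag{12'}$$

Further, one has

$$E_\theta(T) = \sum_{k=1}^{r_0} \Pr(r = k \mid \theta)\, E_\theta(X_{k,n}), \tag{13}$$

where $E_\theta(X_{k,n})$ can be found from (9), and

$$L(\theta) = \sum_{k=0}^{r_0-1} b(k; n, p_\theta). \tag{14}$$

In the replacement case, the probability distribution of $r$ is given by

$$\Pr(r = k \mid \theta) = p(k; \lambda_\theta), \quad k = 0, 1, 2, \ldots, r_0 - 1 \tag{15}$$

and

$$\Pr(r = r_0 \mid \theta) = 1 - \sum_{k=0}^{r_0-1} p(k; \lambda_\theta). \tag{15'}$$

In (15) and (15'), $\lambda_\theta = nT_0/\theta$ and $p(k; \lambda_\theta) = \lambda_\theta^{k} \exp(-\lambda_\theta)/k!$. Further, one has

$$E_\theta(r) = \lambda_\theta\left[\sum_{k=0}^{r_0-2} p(k; \lambda_\theta)\right] + r_0\left[1 - \sum_{k=0}^{r_0-1} p(k; \lambda_\theta)\right], \tag{16}$$

$$E_\theta(T) = \theta E_\theta(r)/n, \tag{17}$$

$$L(\theta) = \sum_{k=0}^{r_0-1} p(k; \lambda_\theta). \tag{18}$$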
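
Formulae (16) to (18) for the replacement case can be evaluated directly from Poisson probabilities. The following is a minimal sketch; the names `poisson_pmf` and `truncated_replacement_plan` are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """p(k; lambda) = lambda^k exp(-lambda) / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def truncated_replacement_plan(n: int, r0: int, T0: float, theta: float):
    """Return (L(theta), E_theta(r), E_theta(T)) from formulae (18), (16), (17)."""
    lam = n * T0 / theta

    def cdf(m: int) -> float:
        # Sum of p(k; lambda) for k = 0..m; an empty sum (m < 0) is zero.
        return sum(poisson_pmf(k, lam) for k in range(m + 1))

    accept_prob = cdf(r0 - 1)                                   # (18)
    exp_failures = lam * cdf(r0 - 2) + r0 * (1 - accept_prob)   # (16)
    exp_time = theta * exp_failures / n                         # (17)
    return accept_prob, exp_failures, exp_time
```

With $n = 39$, $r_0 = 5$, $T_0 = 500$ this reproduces, to rounding, the figures quoted in Problem 2 of Section V for $\theta = 10{,}000$ and $\theta = 2{,}000$.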


We have just given formulae for the OC curve, the expected waiting time,
and expected number of items failed in the course of reaching a decision for
any preassigned $n$, $T_0$, $r_0$. We now give a formula for finding the appropriate
truncated test (that is, for finding $r_0$ and $n$) when the truncation time $T_0$ is
preassigned, and the OC curve is required (for preassigned type I error, $\alpha$, and
type II error, $\beta$) to be such that $L(\theta_0) = 1 - \alpha$ and $L(\theta_1) \le \beta$. It is shown in
[3] that for both the replacement case and the non-replacement case, the appropriate
$r_0$ is precisely the same as the $r$ used in tests of the form (7). Hence
Table 1 can be used. In the replacement case, the appropriate value of $n$ is
given by

$$n = \left[\theta_0\,\chi^2_{1-\alpha}(2r_0)/2T_0\right], \tag{19}$$

where $[x]$ means the greatest integer $\le x$.
In the non-replacement situation, a good approximation for $n$, in case
$\theta_0/T_0 \ge 3$, is given by

$$n = \left[r_0/(1 - e^{-T_0/C})\right], \tag{20}$$

where

$$C = \theta_0\,\chi^2_{1-\alpha}(2r_0)/2r_0.$$


IV. SEQUENTIAL LIFE TESTS


One can make substantial improvements on the procedures described in 
Sections II and III by following a sequential procedure. It is shown in [4] that 
the sequential probability ratio test can be applied to life testing. It is very
interesting that decisions can now be made continuously in time. At each moment
$t$, one can decide either to accept, to reject, or to continue the life test. If we are,
as before, testing $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$ ($\theta_0 > \theta_1$) with type I error $= \alpha$
and type II error $= \beta$, then the decision as time unfolds depends on

$$B < (\theta_0/\theta_1)^{r} \exp\{-(1/\theta_1 - 1/\theta_0)V(t)\} < A, \tag{21}$$

where $A$ and $B$, for all practical purposes, can be taken as

$$A = (1 - \beta)/\alpha \quad \text{and} \quad B = \beta/(1 - \alpha). \tag{22}$$

In (21), $r$ is the number of failures observed by time $t$. The decision to continue
experimentation is made as long as the inequality (21) holds. As soon as (21)
is violated, one accepts $H_0$ if the function of $t$ in (21) is $\le B$, and one rejects
$H_0$ (accepts $H_1$) if the function of $t$ in (21) is $\ge A$.

In (21), $V(t)$ is a statistic which equals the total number of hours lived by
all items, failed and unfailed, up to time $t$. In the replacement case,

$$V(t) = nt, \tag{23}$$

while in the non-replacement case,

$$V(t) = n x_1 + (n-1)(x_2 - x_1) + \cdots + (n-r)(t - x_r) = \sum_{i=1}^{r} x_i + (n - r)t. \tag{24}$$

(In the non-replacement case it may happen that no decision has been reached
by the time $t = x_n$ at which all $n$ items have failed. This will then require that we either put
more items on test and wait until (25) is violated or else have a rule which
will tell us how to terminate the experiment and with what decision at $t = x_n$.
Fortunately $n$ is often at our disposal and so can be chosen sufficiently large
so that the probability of reaching no decision by time $x_n$ is negligible.)

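The inequalities (21) can be rewritten as

$$-h_1 + rs < V(t) < h_0 + rs, \tag{25}$$

where $h_0$, $h_1$, and $s$ are positive constants given by

$$h_0 = \frac{-\ln B}{1/\theta_1 - 1/\theta_0}, \qquad h_1 = \frac{\ln A}{1/\theta_1 - 1/\theta_0}, \qquad s = \frac{\ln(\theta_0/\theta_1)}{1/\theta_1 - 1/\theta_0}.$$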


In the form given by (25), it is easy to carry out the sequential procedure 
graphically. 

The OC curve, that is, the probability of accepting $H_0$ when $\theta$ is the true
parameter value, is given approximately by a pair of parametric equations:

$$L(\theta) = \frac{A^{h} - 1}{A^{h} - B^{h}}, \tag{26}$$

$$\theta = \frac{(\theta_0/\theta_1)^{h} - 1}{h\,(1/\theta_1 - 1/\theta_0)}, \tag{27}$$

by letting the parameter $h$ run through all real values.

The values of $L(\theta)$ at the five points $\theta = 0$, $\theta_1$, $s$, $\theta_0$, $\infty$ enable one to sketch
the entire curve. These values are respectively 0, $\beta$, $\ln A/(\ln A - \ln B)$, $1 - \alpha$,
and 1.


$E_\theta(r)$, the expected number of observations required to reach a decision
when $\theta$ is the mean life, is given by

$$E_\theta(r) \approx \begin{cases} \dfrac{h_1 - L(\theta)(h_0 + h_1)}{s - \theta}, & \theta \ne s, \\[2ex] \dfrac{h_0 h_1}{s^2}, & \theta = s. \end{cases} \tag{28}$$

If we let $k = \theta_0/\theta_1$, the approximate values of $E_\theta(r)$ become particularly
simple when $\theta = \theta_1$, $s$, or $\theta_0$. They are

$$\begin{aligned}
E_{\theta_1}(r) &\approx [\beta \ln B + (1 - \beta) \ln A]/[\ln k - (k - 1)/k], \\
E_{s}(r) &\approx -\ln A \ln B/(\ln k)^2, \\
E_{\theta_0}(r) &\approx [(1 - \alpha) \ln B + \alpha \ln A]/[\ln k - (k - 1)].
\end{aligned} \tag{29}$$

In Table 3, we give $E_\theta(r)$ for five values of $\theta$ ($0$, $\theta_1$, $s$, $\theta_0$, $\infty$), for four values
of $k$ (3/2, 2, 5/2, 3), and for the four number pairs ($\alpha$, $\beta$) which can be made
with the numbers .01 and .05.
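
The constants in (25) and the approximations (26) to (28) can be computed as in the sketch below. The function names (`sprt_constants`, `oc_point`, `expected_failures`) are hypothetical and are not from [4]; the OC curve is traced by varying the parameter $h$, which must be nonzero here.

```python
import math


def sprt_constants(theta0: float, theta1: float, alpha: float, beta: float):
    """Return (h0, h1, s) as defined below (25)."""
    A = (1 - beta) / alpha
    B = beta / (1 - alpha)
    d = 1 / theta1 - 1 / theta0
    return -math.log(B) / d, math.log(A) / d, math.log(theta0 / theta1) / d


def oc_point(h: float, theta0: float, theta1: float, alpha: float, beta: float):
    """Point (theta, L(theta)) on the OC curve for parameter h != 0, per (26)-(27)."""
    A = (1 - beta) / alpha
    B = beta / (1 - alpha)
    d = 1 / theta1 - 1 / theta0
    theta = ((theta0 / theta1) ** h - 1) / (h * d)
    accept_prob = (A ** h - 1) / (A ** h - B ** h)
    return theta, accept_prob


def expected_failures(theta: float, accept_prob: float,
                      h0: float, h1: float, s: float) -> float:
    """Approximation (28) to the expected number of failures at decision."""
    if math.isclose(theta, s):
        return h0 * h1 / s ** 2
    return (h1 - accept_prob * (h0 + h1)) / (s - theta)
```

Taking $h = 1$ and $h = -1$ recovers the two design points $(\theta_0, 1 - \alpha)$ and $(\theta_1, \beta)$, which is a convenient check on any implementation.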

It can be shown that $E_\theta(t)$, the expected waiting time to reach a decision, is
given by the formula

$$E_\theta(t) = E_\theta(r)\,\theta/n \tag{30}$$

in the replacement case. In the non-replacement case,

$$E_\theta(t) = \sum_{k=1}^{n} \Pr(r = k \mid \theta)\, E_\theta(X_{k,n}), \tag{31}$$

where $E_\theta(X_{k,n})$ can be found from (9). A good approximation for $E_\theta(t)$ is given by

$$E_\theta(t) \approx \theta \ln\left(\frac{n}{n - E_\theta(r)}\right). \tag{32}$$

TABLE 3

Approximate values of $E_\theta(r)$ for sequential tests for various
values of $k = \theta_0/\theta_1$ and $\alpha$, $\beta$.


The derivations of all formulae in this section are given in [4]. 


V. NUMERICAL EXAMPLES


1. Design a censored life test which will meet the following conditions:
When $\theta_0 = 1500$, $L(\theta_0) = .95$, and when $\theta_1 = 500$, $L(\theta_1) \le .05$.

Solution: In this problem, $\theta_0/\theta_1 = 3$, $\alpha = \beta = .05$. According to Table 1,
this means that $r = 10$. Since $\chi^2_{.95}(20)/2 = 5.43$ (see Table 1), we get by substituting
in (7) that the acceptance region is given by

$$\hat\theta_{10,n} > \theta_0\,\chi^2_{1-\alpha}(2r)/2r = (1500)(5.43)/10 = 815.$$

Hence the censored life test is as follows: Place $n$ items on test, discontinue
testing after the first 10 items have failed. Compute $\hat\theta_{10,n}$. If $\hat\theta_{10,n} > 815$ hours,
accept $\theta = \theta_0 = 1500$. If $\hat\theta_{10,n} < 815$ hours, reject $\theta = \theta_0$. It can be computed that
for this procedure $L(\theta_0) = .95$ and $L(\theta_1) = .038$.

2. Find a truncated replacement plan for which $T_0 = 500$ hours, which
will accept a lot with mean life $\theta_0$ = 10,000 hours at least 95 per cent of the
time, and reject a lot with mean life $\theta_1$ = 2,000 hours at least 95 per cent of
the time. Compute $L(\theta)$, $E_\theta(T)$, and $E_\theta(r)$ at $\theta_0$ = 10,000 and $\theta_1$ = 2,000 respectively.

Solution: Since $\theta_0/\theta_1 = 5$, $\alpha = \beta = .05$, it follows from Table 1 that $r_0 = 5$.
Substituting in (19), we get $n = 39$. Thus the following truncated replacement
plan meets the requirements: Start the life test with $n = 39$ items. As soon
as one item fails, replace it by a new item. Accept the lot if $\min(X_{5,39}; 500) = 500$
(i.e., if fewer than 5 failures occur before 500 hours), and reject the lot if
$\min(X_{5,39}; 500) = X_{5,39}$ (i.e., if the fifth failure occurs before 500 hours). If
the lot is rejected, life testing is stopped at $X_{5,39}$, the time of occurrence of the
fifth failure.

For $\theta$ = 10,000, $\lambda_\theta = nT_0/\theta$ = (39)(500)/10,000 = 1.95. Using Molina’s
tables, one finds from (18) that $L(\theta) = .952$. Substituting in (16) and (17),
respectively, gives $E_\theta(r) = 1.93$ and $E_\theta(T) = 495$. For $\theta$ = 2,000, $\lambda_\theta = nT_0/\theta = 9.75$.
For this value of $\theta$, $L(\theta) = .034$, $E_\theta(r) = 4.95$, and $E_\theta(T) = 254$.

3. If the word “non-replacement” is substituted for “replacement” in Problem
2, what is the appropriate value for $n$, the number of items on test?

Solution: Substituting in (20), using $r_0 = 5$, $T_0 = 500$, $\theta_0$ = 10,000, we get
$n = 42$.

4(a). Find a sequential replacement life test which will accept a lot with mean
life $\theta_0$ = 1,500 hours 95 per cent of the time, and will reject a lot with mean
life $\theta_1$ = 500 hours 95 per cent of the time. The constant number of items under
test is $n = 20$.

(b) Compute $E_\theta(r)$ and $E_\theta(t)$ for $\theta = 0$, $\theta_1$ (= 500), $s$ (= 823), $\theta_0$ (= 1,500),
and $\infty$.

Solution (a): Substituting in (21), we get

$$\frac{1}{19} < 3^{r} e^{-t/37.5} < 19.$$

In this case, (25) becomes

$$-110 + 41r < t < 110 + 41r,$$

where $t$ represents the length of time that the life test has been in progress and
$r$ denotes the number of failures obtained up to time $t$. If, at the time of stopping,
$t$ is less than the left-hand member of the inequality, we reject $\theta_0 = 1500$ (accept
$\theta_1 = 500$); if, at the time of stopping, $t$ is greater than the right-hand side of
the inequality, we accept $\theta_0 = 1500$.

(b) From Table 3, we obtain

$$E_0(r) = 3, \quad E_{\theta_1}(r) = 6.14, \quad E_s(r) = 7.18, \quad E_{\theta_0}(r) = 2.94, \quad E_\infty(r) = 0.$$

In the replacement case, $E_\theta(t)$ is found most easily for all values of $\theta$ ($\ne \infty$)
by using (30). This gives: $E_0(t) = 0$, $E_{\theta_1}(t) = 155$ hours, $E_s(t) = 295$ hours,
$E_{\theta_0}(t) = 220$ hours. For $\theta = \infty$, the expected waiting time is given by $t_\infty$, where
$e^{-t_\infty/37.5} = 1/19$. This gives $t_\infty = E_\infty(t) = 110$ hours.

5(a). Find a truncated (non-sequential) replacement procedure for testing
the hypothesis in Problem 4, using a constant sample size $n = 20$.

(b) Compute $E_\theta(r)$ and $E_\theta(t)$ for this plan for $\theta = 0$, $\theta_1$, $s$, $\theta_0$, $\infty$.

Solution (a): Let us look at Formula (7). In the replacement case, $\hat\theta_{r,n} = n x_{r,n}/r$
and hence the region of acceptance for $\theta = \theta_0$ for $H_0$ becomes: accept if
$x_{r,n} > \theta_0\,\chi^2_{1-\alpha}(2r)/2n$.

In the problem under consideration, $\theta_0$ = 1,500, $\theta_1$ = 500, $\alpha = \beta = .05$,
$n = 20$. Hence, $r = 10$, $\chi^2_{1-\alpha}(20)/2 = 5.43$. The rule of action becomes:

Accept if $x_{10,20} > 407.5$; reject otherwise.

This is a truncated rule of action with $r_0 = 10$, $T_0 = 407.5$.
(b) Using formulas (16) and (17), we obtain

$$E_0(r) = 10, \quad E_{\theta_1}(r) = 9.98, \quad E_s(r) = 8.73, \quad E_{\theta_0}(r) = 5.39, \quad E_\infty(r) = 0$$

and

$$E_0(t) = 0, \quad E_{\theta_1}(t) = 248, \quad E_s(t) = 360, \quad E_{\theta_0}(t) = 404.5, \quad E_\infty(t) = 407.5.$$


6. Assume that we are testing the hypothesis in Problem 4. A sample of size
20 is placed on test. Items which fail are replaced at once by new items from
the same lot. The experiment is started at time $t = 0$ and the first 5 failures
occur at $x_1 = 20.1$ hours, $x_2 = 100.5$ hours, $x_3 = 121.7$ hours, $x_4 = 167.4$ hours,
and $x_5 = 179.2$ hours, all times being measured from $t = 0$.

(a) Verify that no decision has been reached by time $x_5$.

(b) Verify that if the sixth failure has not yet occurred at 315 hours, we can
stop life testing at that time with acceptance of $H_0$, namely that $\theta$ = 1,500.

Solution: We remarked in the solution to Problem 4 that (25) becomes

$$-110 + 41r < t < 110 + 41r.$$

The region is drawn in Figure 1. The life test data are plotted by moving vertically
so long as we are waiting for the next failure to occur, and moving horizontally
by one unit (in $r$) at each failure time. In Figure 1, the path crosses into the
region of acceptance when $r = 5$ at time $t = 110 + (41)(5) = 315$ hours. Since
the sixth failure has not yet occurred, we can stop life testing at $t = 315$ hours,
with acceptance of $H_0$.

Figure 1. Number of failures. (Boundary lines: $t = 41r + 110$ and $t = 41r - 110$.)

Remark: Suppose we happen to know that the sixth failure occurs at $x_6 = 346.7$
hours. Then, as indicated in Figure 1, we saved 346.7 − 315 = 31.7 hours by
virtue of the fact that life test data were available continuously in time.

7. The first six failure times in a sample of 20 (with replacement) are $x_1 = 19.3$,
$x_2 = 42.8$, $x_3 = 49.9$, $x_4 = 96.7$, $x_5 = 115.2$, $x_6 = 127.7$. Verify that if the hypotheses
being tested are those in Problem 4, then $H_0$ is rejected at time $x_6 = 127.7$
hours.

Solution: $x_1$, $x_2$, $x_3$, $x_4$, and $x_5$ all fall within the region bounded by the
two straight lines,

$$-110 + 41r < t < 110 + 41r.$$

However, when $r = 6$, $-110 + 41r = 136$. Since $x_6 = 127.7 < 136$, $H_0$ is rejected
at time $x_6 = 127.7$ hours. A graphical solution is given in Figure 2.

Figure 2. Number of failures. (Boundary lines: $t = 41r + 110$ and $t = 41r - 110$.)
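
The graphical procedure used in Problems 6 and 7 can also be followed numerically. The sketch below is an illustration for the replacement case only, where $V(t) = nt$ so that the boundaries can be expressed directly in $t$; the function name `run_sequential_test` and the argument `observed_until` (the time up to which the test has been watched) are assumptions introduced here.

```python
from typing import Sequence, Tuple


def run_sequential_test(failure_times: Sequence[float], observed_until: float,
                        intercept: float, slope: float) -> Tuple[str, float, int]:
    """Follow the path (r, t) through the region
    -intercept + slope*r < t < intercept + slope*r.

    Returns (decision, time of decision, failures so far), where decision is
    "accept", "reject", or "continue" if no boundary has been reached yet.
    """
    r = 0
    for x in sorted(failure_times):
        upper = intercept + slope * r
        # While waiting for the next failure, only the upper line can be crossed.
        if upper <= min(x, observed_until):
            return ("accept", upper, r)
        if x > observed_until:
            break                     # this failure has not been observed yet
        r += 1
        # A failure moves the path one unit in r; check the lower line.
        if x <= -intercept + slope * r:
            return ("reject", x, r)
    upper = intercept + slope * r
    if upper <= observed_until:
        return ("accept", upper, r)
    return ("continue", observed_until, r)
```

With `intercept = 110` and `slope = 41`, the data of Problem 6 give acceptance at 315 hours with five failures, and the data of Problem 7 give rejection at 127.7 hours on the sixth failure.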


VI. CONCLUSION


We have not attempted in this paper to cover all of the papers which have 
been published in the field of life testing. We have selected essentially three 
papers [2, 3, 4] which give some of the most important results. A careful reading 
of these papers gives a good introduction to the statistical methodology involved 
in life testing. We have given several numerical examples to illustrate the theory.

Estimation from Life Test Data*


BENJAMIN EPSTEIN 
Wayne State University and Stanford University 


In this paper four estimation procedures are discussed which are useful if one 
wishes to make point or confidence interval estimates from life test data. The first 
two procedures can be applied when the underlying density function of life is expo- 
nential. The third and fourth procedures are non-parametric. Each of the procedures 
is illustrated by means of numerical examples. 


ESTIMATION PROCEDURES FOR THE EXPONENTIAL DISTRIBUTION 


We first discuss problems of estimation where the life, X, is described by 
the p.d.f. 


$$f(x; \theta) = \frac{1}{\theta}\, e^{-x/\theta}, \quad x > 0,\ \theta > 0. \tag{1}$$


Case I: Life testing is discontinued after a fixed number, $r$, of items have
failed. Items on test may or may not be replaced. The number of items initially on
test is $n$.


The “best” estimate of the mean life, $\theta$, is given by

$$\hat\theta_{r,n} = T_{r,n}/r, \tag{2}$$

where $T_{r,n}$ is the accumulated life on test until the $r$-th failure occurs. The
observed failure times are $x_{1,n} \le x_{2,n} \le \cdots \le x_{r,n} \le \cdots$. For simplicity of
notation, we will suppress the subscript $n$.


If testing is terminated after $r$ ($\le n$) failures have occurred then, in the non-replacement
case:

$$T_r = \sum_{i=1}^{r} x_i + (n - r)\,x_r. \tag{3}$$

In the replacement case (where $r$ is now unrestricted):

$$T_r = n\,x_r. \tag{4}$$

The probability density function of $\hat\theta_r$, in either the replacement or non-replacement
case, is given by:

$$f(y) = \begin{cases} \dfrac{1}{(r-1)!}\left(\dfrac{r}{\theta}\right)^{r} y^{r-1} e^{-ry/\theta}, & y > 0, \\ 0, & \text{elsewhere.} \end{cases} \tag{5}$$


* Work done with the partial support of the Office of Naval Research. A more detailed 
presentation of the first three estimation procedures is given in sections 1, 2 and 3 of Techni- 
cal Report No. 4, ONR Contract-Nonr-2163(00). The material will appear in unabridged 
form in a monograph being written by the author.

From (5) it follows that

$$\frac{2r\hat\theta_r}{\theta} = \frac{2T_r}{\theta} \ \text{is distributed as } \chi^2(2r) \tag{6}$$

(i.e., as chi square with $2r$ degrees of freedom).


From (6) it follows that a two-sided $100(1 - \alpha)$ percent confidence interval
for $\theta$ is given by:

$$\frac{2T_r}{\chi^2_{\alpha/2}(2r)} < \theta < \frac{2T_r}{\chi^2_{1-\alpha/2}(2r)}, \tag{7}$$

where $\chi^2_{\alpha/2}(2r)$ is the upper $\alpha/2$ percentage point of $\chi^2(2r)$ and $\chi^2_{1-\alpha/2}(2r)$ is
the lower $\alpha/2$ percentage point of $\chi^2(2r)$. Similarly a one-sided $100(1 - \alpha)$
percent confidence interval for $\theta$ is given by

$$\theta > \frac{2T_r}{\chi^2_{\alpha}(2r)}, \tag{8}$$

where $\chi^2_{\alpha}(2r)$ is the upper $\alpha$ percentage point of $\chi^2(2r)$.
Put into words, $100(1 - \alpha)$ percent of assertions of the kind made in (7)
and (8) will be correct.


Example 1: 


20 electron tubes are placed on test. A tube which fails is replaced at once 
by a new tube. The fifth failure is observed to occur 407 hours after the start 
of the life test. 

Estimate the mean life $\theta$ and give one and two-sided 95% confidence intervals
for $\theta$.

Solution: We are dealing with a replacement situation with $n = 20$, $r = 5$,
$x_5 = 407$. The total life observed is, according to (4), given by $T_5 = 20x_5 =
20(407) = 8140$. Thus it follows from (2) that $\hat\theta = T_5/5 = 8140/5 = 1628$
hours. To find a two-sided 95% confidence interval for $\theta$ we use (7) with
$\chi^2_{.025}(10) = 20.483$ and $\chi^2_{.975}(10) = 3.247$. Substituting in (7) we get the two-sided
95% confidence interval $795 < \theta < 5014$. To find a one-sided 95% confidence
interval we use (8) with $\chi^2_{.05}(10) = 18.307$. Substituting in (8) we get
the one-sided 95% confidence interval $\theta > 889$ hours.
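
The intervals (7) and (8) can be computed without tables, as in the sketch below. This is only an illustration: the helper names are hypothetical, and the chi-square percentage points are found by bisection on the closed-form tail probability available for an even number of degrees of freedom.

```python
import math


def chi2_sf_even(x: float, r: int) -> float:
    """P(chi-square with 2r degrees of freedom > x), closed form for even d.f."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, r):
        term *= half / k
        total += term
    return math.exp(-half) * total


def chi2_upper_point(p: float, r: int) -> float:
    """Upper 100p per cent point of chi-square(2r), by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, r) > p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, r) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mean_life_intervals(total_life: float, r: int, alpha: float):
    """Point estimate (2), two-sided interval (7), one-sided lower bound (8)."""
    point = total_life / r
    two_sided = (2 * total_life / chi2_upper_point(alpha / 2, r),
                 2 * total_life / chi2_upper_point(1 - alpha / 2, r))
    lower_bound = 2 * total_life / chi2_upper_point(alpha, r)
    return point, two_sided, lower_bound
```

Calling `mean_life_intervals(8140, 5, 0.05)` gives values that agree, after rounding, with the figures in Example 1.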

Frequently we are not only interested in estimating $\theta$, but also in estimating
a quantity $x_p$, where $x_p$ is that life such that

$$\Pr(X > x_p) = p. \tag{9}$$

For the exponential distribution

$$x_p = \theta \log\frac{1}{p}. \tag{10}$$

(All logarithms used in this paper are natural logarithms.)
Maximum likelihood estimates and $100(1 - \alpha)$ percent confidence intervals
for $x_p$ are given by:

$$\hat{x}_p = \hat\theta_r \log\frac{1}{p}, \tag{11}$$

$$\frac{2T_r \log\frac{1}{p}}{\chi^2_{\alpha/2}(2r)} < x_p < \frac{2T_r \log\frac{1}{p}}{\chi^2_{1-\alpha/2}(2r)} \tag{12}$$

in the two-sided case and

$$x_p > \frac{2T_r \log\frac{1}{p}}{\chi^2_{\alpha}(2r)} \tag{13}$$

in the one-sided case.
Formula (13) can be interpreted as follows:

We can be $100(1 - \alpha)$ percent confident of the truth of the assertion that
the probability of surviving

$$\tau = 2T_r \log\frac{1}{p} \Big/ \chi^2_{\alpha}(2r)$$

time units is $\ge p$.

This is a tolerance interval statement in that we can be $100(1 - \alpha)$ percent
confident of the correctness of the assertion that the fraction of items in the
population surviving $\tau$ or more time units is $\ge p$. Putting the last statement
into reliability language we can be $100(1 - \alpha)$ percent confident that the reliability
over $(0, \tau]$ is $\ge p$.


Example 2:

Given the data in Example 1, estimate $x_{.9}$, where $x_{.9}$ is such that
$\Pr(X > x_{.9}) = .9$
(i.e., the probability of surviving for $x_{.9}$ hours is .9). Give one and two-sided
95% confidence intervals for $x_{.9}$.

Solution: $\log 1/p = \log 1/.9 = .1054$. Hence, substituting in (11) we get

$$\hat{x}_{.9} = (1628)(.1054) = 172 \text{ hours.}$$

Substituting in (12) we get the two-sided 95% confidence interval $83.8 < x_{.9} < 528$,
and substituting in (13) we get the one-sided 95% confidence interval $x_{.9} > 93.7$
hours.


Example 3: 


Given the data in Example 1, find a number $\tau$ such that we can assert with
95% confidence that at least 90% of the items in the population survive $\tau$
hours. (Note that this is a tolerance statement. It can also be given a reliability
interpretation.)

Solution: We have noted above that one-sided $100(1 - \alpha)$ percent confidence
statements regarding $x_p$ are also tolerance statements in which we can have
$100(1 - \alpha)$ percent confidence. Hence, it follows from the solution to Example
2, that we can assert with 95% confidence that at least 90% of the items in
the population survive $\tau = 93.7$ hours.

We are frequently interested in making point and interval estimates about
the probability that the item survives a preassigned length of time $t^*$. Denoting
this by $p_{t^*}$, we have:

$$p_{t^*} = \Pr(X > t^*) = e^{-t^*/\theta}. \tag{16}$$

It is obvious how one can make point and interval estimates of $p_{t^*}$ from the
corresponding formulae for $\theta$ (see formulae (2), (7), and (8)). In particular a
one-sided $100(1-\alpha)$ percent confidence interval for $p_{t^*}$ is given by

$$p_{t^*} > \exp\left(-\chi^2_{\alpha}(2r)\,t^*/2T_r\right). \tag{17}$$

The question may be asked: How large should the observed $T_r$ be in order that
we be $100(1-\alpha)$ percent confident that

$$p_{t^*} = e^{-t^*/\theta} > \gamma\,? \tag{18}$$

From (17) this implies that

$$\exp\left[-\chi^2_{\alpha}(2r)\,t^*/2T_r\right] > \gamma \tag{19}$$

or

$$T_r > \chi^2_{\alpha}(2r)\,t^* \Big/ 2\log\frac{1}{\gamma}. \tag{20}$$

The interpretation of (20) is as follows:

If the total life observed in getting $r$ failures exceeds $\chi^2_{\alpha}(2r)\,t^*/2\log(1/\gamma)$, then
we can be $100(1-\alpha)$ percent confident of the assertion that the probability
of surviving time $t^*$ is $> \gamma$. In reliability considerations we can replace the words
"probability of surviving time $t^*$ is $> \gamma$" by "reliability over a time interval
of length $t^*$ is $> \gamma$".


Example 4: 


Given the data in Example 1, make one and two-sided 95% confidence statements
for the probability of surviving $t^* = 100$ hours.

Solution: The maximum likelihood estimate of $p_{t^*}$, the probability of surviving
$t^* = 100$ hours, is given by

$$\hat{p}_{t^*} = e^{-100/1628} = .9404.$$

Similarly, a two-sided 95% confidence interval for $p_{t^*}$ is given by

$$\left(e^{-100/795} < p_{t^*} < e^{-100/5014}\right) = (.8817 < p_{t^*} < .9802).$$

A one-sided 95% confidence interval for $p_{t^*}$ is given by substituting in (17).
This gives us

$$p_{t^*} > .8936.$$

We can be 95% confident of the assertion that the probability of surviving
100 hours (reliability over the time $(0, 100]$) is $> .8936$.
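
Because $e^{-t^*/\theta}$ increases with $\theta$, each limit for $\theta$ maps directly into a limit for the survival probability. The sketch below illustrates this for Example 4; the function name is illustrative, and the two-sided limits 795 and 5014 for $\theta$ are the values implied by formula (7) when $T_r = 8140$ (an inferred value, as the total life is not restated here).

```python
import math

def survival_probability(t_star, theta):
    """Probability exp(-t*/theta) that an exponential life exceeds t_star."""
    return math.exp(-t_star / theta)

t_star = 100.0
p_hat = survival_probability(t_star, 1628.0)        # from the point estimate of theta
p_low = survival_probability(t_star, 795.0)         # from the lower two-sided limit of theta
p_high = survival_probability(t_star, 5014.0)       # from the upper two-sided limit of theta
p_one_sided = survival_probability(t_star, 889.0)   # from the one-sided lower limit of theta
print(p_hat, p_low, p_high, p_one_sided)
```

The four values agree with those of Example 4 to about one unit in the fourth decimal place.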
Example 5: 


The total life observed in obtaining 5 failures is 9205 hours. On the basis of
this information, can we be 95% confident that the probability of surviving
(reliability) for a time $t^* = 100$ is $> .90$?

Solution: From (20) it is known that in order to be 95% confident that the
probability of surviving for a time $t^*$ is $> .9$, it is necessary that the total observed
life

$$T_r > \chi^2_{.05}(10)\,(100) \Big/ 2\log\frac{1}{.9} = 8689.$$

Since the total life observed in obtaining 5 failures is 9205, we can answer in
the affirmative, i.e., we can be 95% confident that the probability of surviving
for a time $t^* = 100$ is $> .90$.
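
The threshold in formula (20) is a one-line computation. The following sketch, with an illustrative function name, evaluates it for Example 5 using the tabled value $\chi^2_{.05}(10) = 18.307$.

```python
import math

def required_total_life(chi2_alpha, t_star, gamma):
    """Right-hand side of formula (20): the total life T_r must exceed this value."""
    return chi2_alpha * t_star / (2.0 * math.log(1.0 / gamma))

needed = required_total_life(18.307, 100.0, 0.90)
observed = 9205.0
print(needed, observed > needed)
```

With the unrounded logarithm the threshold comes out a hour or so below the quoted 8689, which does not affect the conclusion.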

Case II: Underlying distribution is exponential. The life test is discontinued
after a fixed amount of total life $T$ has elapsed. Items under test may or may not
be replaced. (In the important special case where $n$ items are tested, with replacement,
for a length of time $t^*$, $T = nt^*$.)

In what follows let $r$ = number of items which fail in $[0, T]$. Then some formulae
of interest are:

Two-sided $100(1-\alpha)$ percent confidence interval for $\theta$

$$\frac{2T}{\chi^2_{\alpha/2}(2r+2)} < \theta < \frac{2T}{\chi^2_{1-\alpha/2}(2r)}. \tag{21}$$

One-sided $100(1-\alpha)$ percent confidence interval for $\theta$

$$\theta > 2T/\chi^2_{\alpha}(2r+2). \tag{22}$$

One-sided $100(1-\alpha)$ percent confidence interval for the quantity $x_p = \theta \log 1/p$

$$x_p > 2T \log\frac{1}{p} \Big/ \chi^2_{\alpha}(2r+2). \tag{23}$$

If we define $\tau$ as

$$\tau = 2T \log\frac{1}{p} \Big/ \chi^2_{\alpha}(2r+2), \tag{24}$$

then we can assert with $100(1-\alpha)$ percent confidence that at least $100p$ percent
of the items in the population survive for a length of time $\tau$. Putting the last
statement into reliability language, we can be $100(1-\alpha)$ percent confident
of the truth of the assertion that the reliability over $[0, \tau]$ is $> p$.


Example 6: 


30 items are placed on test. Items which fail are replaced. The life test is 
stopped after 100 hours have elapsed. Five failures were observed in the course 
of the experiment. Assuming that the underlying distribution of life is exponential,
find one and two-sided 95% confidence intervals for $\theta$.

Solution: In this problem the fixed amount of total life observed is
$T = nt^* = 30(100) = 3000$. Substituting in (21) and using $\chi^2_{.025}(12) = 23.337$
and $\chi^2_{.975}(10) = 3.247$ one gets the two-sided 95% confidence interval

$$257 < \theta < 1848.$$

Substituting in (22) and using $\chi^2_{.05}(12) = 21.026$, we get the one-sided 95%
confidence interval

$$\theta > 285.$$

Example 7:

Given the data in Example 6, estimate $\tau$ so that we will be 95% confident
that the probability of surviving $\tau$ hours is at least .9. Substituting in (24),
and using $T = 3000$, $r = 5$, $\alpha = .05$, $p = .9$, we get

$$\tau = \frac{6000}{21.026}\,(.1054) = 30.1.$$

On the basis of the data we can be 95% confident that the probability of surviving
$\tau = 30.1$ hours is $> .9$.
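
Formulae (21), (22) and (24) differ from their Case I counterparts only in the degrees of freedom used for the upper percentage point. A minimal sketch, with illustrative function names and the tabled chi-square points passed in as plain numbers, reproduces Examples 6 and 7.

```python
import math

def case2_theta_interval(total_life, chi2_upper_2r_plus_2, chi2_lower_2r):
    """Two-sided limits for theta, formula (21)."""
    return 2 * total_life / chi2_upper_2r_plus_2, 2 * total_life / chi2_lower_2r

def case2_theta_bound(total_life, chi2_alpha_2r_plus_2):
    """One-sided lower limit for theta, formula (22)."""
    return 2 * total_life / chi2_alpha_2r_plus_2

def case2_tau(total_life, p, chi2_alpha_2r_plus_2):
    """Time tau survived with probability at least p, formula (24)."""
    return 2 * total_life * math.log(1.0 / p) / chi2_alpha_2r_plus_2

T = 30 * 100                                             # n items for t* hours, with replacement
theta_low, theta_high = case2_theta_interval(T, 23.337, 3.247)
theta_bound = case2_theta_bound(T, 21.026)
tau = case2_tau(T, 0.9, 21.026)
print(theta_low, theta_high, theta_bound, tau)
```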


TWO NON-PARAMETRIC ESTIMATION PROCEDURES


Case III: $n$ items are placed on life test for a time $t^*$. At the end of this time
one counts the number of items that have failed in $[0, t^*]$. Items that fail are not
replaced.

In what follows let $r$ = number of observed failures. Then we can make the
following non-parametric statement:

We can assert with $100(1-\alpha)$ percent confidence that at least $100b$ percent
of the population survives for a length of time $t^*$ with $b$ given by

$$b = \left[1 + \left(\frac{r+1}{n-r}\right) F_{\alpha}(2r+2,\, 2n-2r)\right]^{-1}. \tag{25}$$

Put in reliability language, we are $100(1-\alpha)$ percent confident of the assertion
that the reliability over $[0, t^*]$ is $> b$.

In the particular case where the underlying distribution is exponential a
one-sided $100(1-\alpha)$ percent confidence interval for $\theta$ is given by

$$\theta > t^* \left[\log\left\{1 + \left(\frac{r+1}{n-r}\right) F_{\alpha}(2r+2,\, 2n-2r)\right\}\right]^{-1}. \tag{26}$$

In (25) and (26), $F_{\alpha}(2r+2, 2n-2r)$ is the upper $\alpha$ percentage point of the
$F(2r+2, 2n-2r)$ distribution.


Example 8: 


20 items are placed on life test for 100 hours. Two items fail before this time. 
Items which fail are not replaced. 

(a) Make a non-parametric one-sided 95% confidence statement about the 
probability of surviving 100 hours. 

(b) If the underlying distribution is exponential, find a one-sided 95% confidence
interval for the mean life, $\theta$.


Solution:


(a) In this problem $n = 20$, $r = 2$, $\alpha = .05$, $t^* = 100$. Since $F_{.05}(6, 36) = 2.36$,
it follows from (25) that $b = .718$. Hence, we can make the following non-parametric
statement:

We are 95% confident of the assertion that the probability of surviving 100
hours (reliability over 100 hours) is $> .718$.

(b) Substituting in (26) we get the one-sided 95% confidence interval, $\theta > 302$
hours.

Example 9: Ten thousand one hour missions are carried out. Ten failures
are observed. Make a one-sided 95% confidence statement about the reliability
(probability of success) in a one hour mission.

Solution: In this problem $n = 10{,}000$, $r = 10$. Substituting in (25) we get
$b = .9983$. We can be 95% confident that the reliability in a one hour mission
is $> .9983$.

Remark: In carrying out the computations we use the fact that since $n$ is
large $F_{.05}(22, 19980) \approx F_{.05}(22, \infty) = 1.54$.


Example 10: 


Suppose that 10,000 one hour missions are carried out and that no failures 
are observed. Find a one-sided 95% confidence interval for the probability of 
mission survival. 

Solution: Substitute in (25) with $n = 10{,}000$, $r = 0$.

$$F_{.05}(2, 20{,}000) \approx F_{.05}(2, \infty) = 3.00.$$

Hence, $b = .9997$. We can have 95% confidence in the assertion that the true
probability of mission survival (reliability) is $> .9997$.
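
Formulae (25) and (26) need only the counts $n$ and $r$ and one tabled $F$ percentage point. The sketch below, with illustrative function names, evaluates them for Examples 8, 9 and 10, taking the $F$ values quoted in the text as inputs.

```python
import math

def case3_reliability_bound(n, r, f_alpha):
    """Formula (25); f_alpha is the upper alpha point of F(2r + 2, 2n - 2r)."""
    return 1.0 / (1.0 + (r + 1) / (n - r) * f_alpha)

def case3_mean_life_bound(n, r, t_star, f_alpha):
    """Formula (26); meaningful only if the life distribution is exponential."""
    return t_star / math.log(1.0 + (r + 1) / (n - r) * f_alpha)

b8 = case3_reliability_bound(20, 2, 2.36)                # Example 8(a)
theta8 = case3_mean_life_bound(20, 2, 100.0, 2.36)       # Example 8(b)
b9 = case3_reliability_bound(10_000, 10, 1.54)           # Example 9
b10 = case3_reliability_bound(10_000, 0, 3.00)           # Example 10
print(b8, theta8, b9, b10)
```

The mean-life bound comes out within about one hour of the quoted 302; the small difference arises from rounding $b$ before taking logarithms.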

Case IV: Items are drawn at random from some population and tested one
at a time. Each item is tested for a length of time $t^*$. An item is called a failure
if it fails before time $t^*$. Otherwise it is called a success. (It should be pointed
out that if the item on test is being monitored continuously, then a failed item
is under test for a length of time $< t^*$, while successful items are on test for
exactly $t^*$ time units.) One continues testing items until a preassigned number,
$r$, of failures occur. Suppose that the number of items tested in reaching the
$r$th failure is $n$ (i.e., the $r$th failure is obtained when the $n$th item is placed
on test and not before). Then we make the following non-parametric statement:

We can assert with $100(1-\alpha)$ percent confidence that at least $100b$ percent
of the population survives for a length of time $t^*$ with $b$ given by

$$b = \left[1 + \left(\frac{r}{n-r}\right) F_{\alpha}(2r,\, 2n-2r)\right]^{-1}. \tag{27}$$

Put into reliability language, we are $100(1-\alpha)$ percent confident of the assertion
that the reliability over $[0, t^*]$ is $> b$.

In the particular case where the underlying distribution is exponential a
one-sided $100(1-\alpha)$ percent confidence interval for $\theta$ is given by

$$\theta > t^* \left[\log\left\{1 + \left(\frac{r}{n-r}\right) F_{\alpha}(2r,\, 2n-2r)\right\}\right]^{-1}. \tag{28}$$


Example 11: 


Items are drawn at random from some population and tested one by one, 
each item being tested for 100 hours. If an item fails in less than 100 hours it 
is called a failure. Otherwise it is called a success. We continue testing items 
until the second failure is obtained. Suppose that the second failure occurs 
when the twentieth item is placed on test.

(a) Make a non-parametric one-sided 95 percent confidence statement about
the probability of surviving 100 hours.

(b) If the underlying distribution is exponential, find a one-sided 95 percent
confidence interval for the mean life, $\theta$.

Solution: (a) In this problem $r = 2$, $n = 20$, $\alpha = .05$, $t^* = 100$. Since
$F_{.05}(4, 36) = 2.63$, it follows from (27) that $b = .774$. Hence, we can make
the following non-parametric statement:

We are 95 percent confident of the assertion that the probability of surviving
100 hours (reliability over 100 hours) is $> .774$.

(b) Substituting in (28) we get the one-sided 95 percent confidence interval,
$\theta > 393$ hours.


Example 12: 


We continue to carry out a sequence of 1 hour missions, until the tenth un- 
successful mission occurs. Suppose that the tenth unsuccessful mission occurs 
on the ten thousandth mission. Find a one-sided 95 percent confidence interval 
for the probability of mission survival. 

Solution: In this problem $r = 10$, $n = 10{,}000$. Substituting in (27), we get
$b = .9984$. We can have 95 percent confidence in the assertion that the true
probability of mission survival (reliability) is $> .9984$.
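
The Case IV bound (27) has the same form as (25), with $r$ in place of $r + 1$ and $2r$ in place of $2r + 2$ degrees of freedom. A minimal sketch with an illustrative function name follows; the value 1.57 used for $F_{.05}(20, \infty)$ in Example 12 is taken from standard tables and is not quoted in the text.

```python
def case4_reliability_bound(n, r, f_alpha):
    """Formula (27); f_alpha is the upper alpha point of F(2r, 2n - 2r)."""
    return 1.0 / (1.0 + r / (n - r) * f_alpha)

b11 = case4_reliability_bound(20, 2, 2.63)         # Example 11(a)
b12 = case4_reliability_bound(10_000, 10, 1.57)    # Example 12, F(20, 19980) taken as F(20, infinity)
print(b11, b12)
```

For the same counts ($n = 20$, $r = 2$) the Case IV bound .774 is higher than the Case III bound .718, which reflects the different stopping rules noted in the concluding remark.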


Concluding remark: 
The reader should note the following: 


In Case I: $r$ is preassigned and $T_r$ is a random variable;
In Case II: $T$ is preassigned and $r$ is a random variable;
In Case III: $n$ (and $t^*$) are preassigned and $r$ is a random variable;
In Case IV: $r$ (and $t^*$) are preassigned and $n$ is a random variable.


It is useful to bear this in mind when comparing (7) with (21), (8) with (22), and
(25) with (27).

A class of incomplete three level factorial designs useful for estimating the
coefficients in a second degree graduating polynomial is described. The designs either
meet, or approximately meet, the criterion of rotatability and for the most part can
be orthogonally blocked. A fully worked example is included.


1.0. INTRODUCTION 


A symmetrical factorial design is an experimental arrangement in which a
small integral number $p$ of levels is chosen for each of $k$ factors (i.e. variables)
and all $p^k$ combinations of these levels are run. Classes of these designs which
have proved to be of particular interest are those in which two levels or three
levels are used for each of the $k$ variables. These are called respectively $2^k$ and
$3^k$ factorials. If not all the factorial combinations are employed but merely
a selected subset, we call the design an incomplete factorial. Any factorial or
incomplete factorial we call a factorial-type design.

A class of incomplete factorials of considerable interest are the fractional
factorials of D. J. Finney [1] [2]. In these arrangements certain finite group
properties are employed to select a $(1/p)^q$ fraction of the complete design which
then requires only $p^{k-q}$ combinations of levels and may be called a $p^{k-q}$ factorial.
A useful and different class of incomplete factorials in which the selected subset
is not restricted to be a $(1/p)^q$ fraction is due to Plackett and Burman [3].

An infinite choice exists for the levels of quantitative variables such as tem- 
perature. In developing designs specifically for quantitative variables, there is 
therefore no essential need to restrict experimental conditions to combina- 
tions of a few basic levels of the component factors. Many useful designs have 
indeed been devised for the study of quantitative variables which do not employ 
the factorial principle [4] [5]. In spite of this, cases are not uncommon where 
even though the factors are all quantitative, convenience requires the use of 
only a few levels for each. 

In this paper we discuss a particular class of three-level incomplete factorials 
specifically selected for the study of quantitative variables. The class of designs 
is not included among the types of incomplete factorials already discussed but 
nevertheless appears to be of considerable practical importance. 


2.0. INCOMPLETE FACTORIALS FOR QUANTITATIVE VARIABLES


When a design involving $N$ runs is employed to separately estimate $L$ constants
we may define the ratio $R = N/L$ as the redundancy factor for the design.
This factor is necessarily not less than unity.

Suppose in what follows that the functional relationship between the response
of interest and the levels of the $k$ quantitative experimental variables may be
graduated by a general polynomial of degree $d$ in the levels of the variables. A
design suitable for separately estimating the $(k + d)!/k!\,d!$ constants of such
a polynomial is called a design of order $d$. The highest degree of polynomial
that may be fitted to the observations from a $p$-level factorial is $p - 1$. Consequently
when regarded as a design for the fitting of a general polynomial the
$p^k$ factorial is a design of order $p - 1$. The redundancy factor for such a design
is therefore $p^k\,k!\,(p - 1)!/(k + p - 1)!$. When calculated in this way the redundancy
factors for the complete factorials are usually large. For example,
regarded as a first order design, the two-level factorial in five factors requires
$2^5 = 32$ runs to estimate the 6 constants of the first degree polynomial. It therefore
has a redundancy factor of 32/6 = 5.3. Similarly, regarded as a second
order design the three-level factorial in five factors requires $3^5 = 243$ runs to
estimate the 21 constants of the second degree polynomial. It therefore has a
redundancy factor of 243/21 = 11.6.
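
The count of constants and the redundancy factor are easy to tabulate. The following sketch uses illustrative function names; it relies only on the identity $(k + d)!/k!\,d! = \binom{k+d}{d}$.

```python
from math import comb

def n_constants(k, d):
    """Number of constants in a general polynomial of degree d in k variables."""
    return comb(k + d, d)

def redundancy_factor(n_runs, k, d):
    """R = N / L for a design of order d in k variables with n_runs runs."""
    return n_runs / n_constants(k, d)

first_order = redundancy_factor(2 ** 5, k=5, d=1)    # two-level factorial in five factors
second_order = redundancy_factor(3 ** 5, k=5, d=2)   # three-level factorial in five factors
print(n_constants(5, 1), n_constants(5, 2), first_order, second_order)
```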

In situations in which the experimental error variance is not so large as to
require large numbers of observations to obtain necessary precision, designs
having small redundancy factors are desirable. Small redundancy factors may sometimes
be obtained by using incomplete rather than complete factorial designs.
For example, if $k = 3, 7, 11, 15, \ldots, 4i - 1$ the two-level arrangements of
Plackett and Burman provide first order designs requiring respectively only
$4, 8, 12, 16, \ldots, 4i$ runs, where $i$ is a positive integer. They are thus first order
two-level designs of redundancy unity. Designs having this minimal redundancy
are seldom employed in practice because they provide no residual degrees
of freedom and so do not allow the possibility of partially checking [6] [7] the
adequacy of the assumed form of model. Other incomplete two-level factorial
designs are available, however, having low redundancy factors of two or less
which do not suffer from this deficiency.

For the presently available three-level factorials the situation is less satisfactory
than for the two-level designs. For example, the various one-ninth
replicates of the $3^5$ factorials all seem to lead to undesirable correlation or confounding
of estimates of the coefficients and although a one-third replicate of
the $3^5$ factorial may be employed as a second order design it has a redundancy
factor of 81/21 = 3.9 which is somewhat high.

In developing the present class of designs we do not use the group properties
exploited by Finney; rather we set out directly to select part of the $3^k$ factorial
which allows efficient estimation of a second degree graduating polynomial.
Specifically, we have where possible set out to generate second order rotatable
designs. Arguments in favor of such a choice have been presented elsewhere [5].
Suppose we code the levels in standardized units so that the 3 values taken by
each of the variables $x_1, x_2, \ldots, x_k$ are $-1$, 0, and 1 and suppose also that the
second degree graduating polynomial fitted by the method of least squares is

$$\hat{y} = b_0 + \sum_{i=1}^{k} b_i x_i + \sum_{i=1}^{k} \sum_{j=i}^{k} b_{ij} x_i x_j.$$

A second order rotatable design is such that the variance of $\hat{y}$ is constant for all
points equidistant from the center of the design, that is, for all points for which
$\rho = (\sum_i x_i^2)^{1/2}$ is constant. Among the class of rotatable designs we select those
for which the variance of $\hat{y}$, regarded as a function of $\rho$, is reasonably constant
in the region of the $k$-space covered by the design. The requirement of rotatability
is introduced to ensure a symmetric generation of information in the space of
the variables defined and scaled in a manner currently thought most appropriate
by the experimenter. For a design to be useful it need not have the property of
rotatability exactly. For certain values of $k$, it turns out that within the class of
designs we consider, rotatability can be achieved exactly; in other cases, exact
rotatability is not possible and here, as described more fully in Appendix A, we
relax the requirement to some extent. All the designs we discuss possess a high
degree of orthogonality; in fact, only the constant term $b_0$ and the quadratic
estimates $b_{ii}$ are correlated* one with another.

The requirement of rotatability or near-rotatability imposes certain restric- 
tions [5] on the moments of the design. In Appendix A it is shown that when 
these restrictions are applied to variables which can take only the values —1, 0, 
and 1 certain simple combinatorial requirements emerge and that these require- 
ments can be satisfied by combining two-level factorial designs and incomplete 
block designs in a particular manner exemplified in the next section. 

The existence of the class of designs discussed here was suggested by the 
discovery in another connection [8] of a three-level rotatable design in seven 
variables which required only 56 points plus added points at the origin thus 
providing highly efficient estimates of the 36 constants in the polynomial of 
second degree. Further investigation led to the development of the present class 
of three-level designs utilizing the properties of incomplete blocks. 


3.0. METHOD FOR GENERATING THE DESIGNS


The designs are formed by combining two-level factorial designs with incom- 
plete block designs in a particular manner. This is best illustrated by an example. 
In Table 1 is shown a balanced incomplete block design for testing k = 4 varieties 
in b = 6 blocks of size s = 2. 


TABLE 1 
A balanced incomplete block design for four varieties in six blocks. 
k = 4 varieties 


* Designs for which there is no correlation between either all or a subset of the quadratic
coefficients can be obtained but they do not seem to possess any particular advantage [5] so
far as estimating the response is concerned.

TABLE 2
A $2^2$ factorial design.

  x_i    x_j
  -1     -1
   1     -1
  -1      1
   1      1


If this design were being used in the usual way, varieties 1 and 2 denoted by 
x, and x, would be tested in the first block, varieties 3 and 4 in the second, and 
so on. 

A basis for a three-level design in four variables is obtained by combining
this incomplete block design with the $2^2$ factorial of Table 2. The two asterisks
in every row of the incomplete block design are replaced by the $s = 2$ columns
of the two-level $2^2$ design. Wherever an asterisk does not appear a column of
zeros is inserted. The design is completed by the addition of a number of center
points (0, 0, 0, 0), about three being desirable with this arrangement. The
resulting design is shown in Table 3. As explained later, this design can in fact
be run in three orthogonal blocks. These are indicated by dotted lines in the
table.

The design obtained is a rotatable second order design suitable for studying 
four variables in 27 trials and is capable of being blocked in three sets of nine 
trials. It is shown in Appendix B that this particular design is in fact a rotation 
of the corresponding central composite rotatable design [5] in four variables. 
It is however not generally true that the present class of designs can be generated 
from the central composite designs by rotation. 
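
The construction just described can be sketched in a few lines. The code below is an illustration only: the function name is hypothetical, and because the row order of Table 1 is not reproduced here the six blocks are simply taken as all pairs of the four varieties in lexicographic order, which gives the same set of runs but not necessarily the same run order or block assignment as Table 3.

```python
from itertools import combinations, product

def three_level_design(k, blocks, n_center):
    """Combine an incomplete block design with two-level factorials.

    blocks   : iterable of tuples of variable indices (the varieties in each block)
    n_center : number of center points to append
    """
    runs = []
    for block in blocks:
        # A full two-level factorial in the variables of this block; all others are held at 0.
        for levels in product((-1, 1), repeat=len(block)):
            run = [0] * k
            for variable, level in zip(block, levels):
                run[variable] = level
            runs.append(tuple(run))
    runs.extend([(0,) * k] * n_center)
    return runs

blocks = list(combinations(range(4), 2))     # six blocks of size s = 2 (order assumed)
design = three_level_design(4, blocks, n_center=3)

# Moment sums for the first variable (and the first pair of variables).
pure_fourth = sum(x[0] ** 4 for x in design)
mixed_fourth = sum(x[0] ** 2 * x[1] ** 2 for x in design)
print(len(design), pure_fourth, mixed_fourth)
```

The pure fourth moment sum is three times the mixed one, which is the moment condition usually associated with second order rotatability [5].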


TABLE 3
An incomplete $3^4$ factorial in three blocks of nine experimental runs.

In Table 4 a number of designs of the class under study are given suitable for
investigating 3, 4, 5, 6, 7, 9, 10, 11, 12, and 16 variables. In this table unless
otherwise indicated the symbol $(\pm 1, \pm 1, \ldots, \pm 1)$ means that all combinations
of plus and minus levels are to be run. Whenever a fractional factorial is available
which does not confound main effects and two factor interactions one with
another, it may be used instead of the full factorial. For example, in design
No. 8, $s$ is equal to five and as indicated in the table rather than using a full $2^5$
factorial we can achieve the desired result with a half-replicate.

Three members of the class of designs have been generated by other methods 
and have appeared elsewhere. Design No. 1 was first described by DeBaun 
[9], [10] and design No. 2 by Gardiner, Grandage and Hader [11]. The general 
method of Bose and Draper rederived design No. 2 in [12] and produced the 
points in designs No. 1 and No. 3 as identifiable subsets of rotatable designs 
in [13] and [12] respectively. 


4.0. BLOCKING THE DESIGNS


Where insufficient homogeneous experimental material is available for all
the experimental runs it becomes desirable to run them in blocks. Where possible
it is desirable to achieve orthogonal blocking, that is to arrange that the block
contrasts are uncorrelated with all the estimates of the coefficients in the
polynomial. When this can be achieved the analysis may be carried out almost
as if block differences did not exist. The only modification necessary is that
in the analysis of variance table the sum of squares associated with block differences
must be subtracted from the residual sum of squares. On the assumption
that the model is adequate, the residual sum of squares so adjusted may then
be used to estimate the within-block variance and hence the standard errors
of the coefficients.

The requirements for orthogonal blocking of second order designs have been 
given elsewhere [5]. Applying these results to the present problem, it is easy 
to see that: 


(1) Where "replicate sets" can be found in the generating incomplete block
design these provide a basis for orthogonal blocking. These replicate
sets are subgroups within which each variety is tested the same number
of times.

(2) Where the component factorial designs can be divided into blocks which 
only confound interactions of more than two factors these can provide a 
basis for orthogonal blocking. 


An illustration of the first method of blocking has already been given in the 
example of Section 3.0. In Table 4 dotted lines indicate the appropriate divisions 
into replicate sets. Using these divisions design No. 2 can be split into three 
blocks, design No. 3 into two blocks, design No. 6 into five blocks and design 
No. 10 into six blocks. In these and other blocking schemes discussed below, the 
center points must be distributed equally among blocks to retain orthogonality. 

The second method may be illustrated with design No. 4 for which the first
method cannot be employed. The basis for the design consists of 48 trials generated
from six $2^3$ factorial designs. If we were running a single $2^3$ factorial design,
it could be performed in two sets of four trials, confounding the three-factor
interaction with blocks. Trials with levels (1, 1, 1), (1, -1, -1), (-1, -1, 1),
(-1, 1, -1) would be included in one set (called the positive set) and trials
with levels (-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1) in the other (called
the negative set). The complete group of 48 trials can be split into two orthogonal
blocks of 24 by allocating one set (either positive or negative) from each of the
$2^3$ factorial designs to one block, and the remainder to the other.
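
The positive and negative sets are determined by the sign of the product of the levels in each trial. A minimal sketch, with a hypothetical function name, is:

```python
from itertools import product
from math import prod

def split_by_highest_interaction(s):
    """Split a full two-level factorial in s factors by the sign of the s-factor interaction."""
    positive, negative = [], []
    for run in product((-1, 1), repeat=s):
        (positive if prod(run) > 0 else negative).append(run)
    return positive, negative

positive_set, negative_set = split_by_highest_interaction(3)
print(positive_set)
print(negative_set)
```

Within each set every factor appears equally often at each level, so main effects remain balanced in each block.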

This method is used where the block size $s > 2$ and employed for designs
4, 5, 6, 7, 9, and 10 in Table 4. In designs 7, 9, and 10 the basic factorial is a
$2^4$ design. This is split into two sets in such a way as to confound the four factor
interaction, that is to say trials with levels whose product is positive are allocated
to one group, and the remainder to the other.

In some cases, both methods may be used simultaneously. Thus in design 6 
the basic incomplete block design contains five ‘replicates’ indicated by the 
dotted lines in the table, providing a basis for generating five blocks of 24 runs. 
Each one of these blocks may now be split into two by allocating the positive 
sets of the component factorials to one block and the negative sets to the other. 
We obtain finally an arrangement for generating ten blocks of twelve runs. A 
similar procedure may be applied in blocking design No. 10. 

While orthogonal blocking is desirable, since it minimizes the variance of 
the estimates of the regression coefficients, non-orthogonal blocking schemes 
may be employed without an excessive loss of precision when smaller block 
sizes than those given above are required. Such schemes will not be discussed 
in the present communication. 


5.0. INCLUSION OF CENTER POINTS


In addition to the runs generated directly from the $2^s$ factorial design it is
also necessary to include $n_0$ center points in order to avoid singularity in the
moment matrix. The number of center points affects the variance profile, that
is, the variance of $\hat{y}$ regarded as a function of the distance $\rho = \sqrt{\sum x_i^2}$ from
the center of the design. The exact number of center points is not critical. The
numbers given in the table are chosen so that the variance profile will be reasonably
uniform over the region of the experimental design and so that an even
number of center points appear in each block. The variance profiles resulting
from the designs here considered are shown in Figure 1 of Appendix A.


6.0. ANALYSIS FOR THE DESIGNS 


In Tables 5a, 5b, and 5c, formulae and constants are given which are needed 
for the analysis of the designs of Table 4. The notation is explained below. 


6.1. Calculation of the estimates. 


In order to calculate the estimates $b_0$, $b_i$, $b_{ii}$, $b_{ij}$, it is first necessary to
write out the levels for each of the variables in the design and then to add
further columns corresponding to $x_1^2, x_2^2, \ldots, x_k^2, x_1x_2, x_1x_3, \ldots, x_{k-1}x_k$.
This is done in Table 6 for design No. 2 where a set of typical data is also shown
for illustration. The sums of products of the entries in the columns with the
observations $y$ are next calculated. In addition $\bar{y}_0$, the average value of the
observations made at the center points, is shown. The calculated quantities
are next substituted in the formulae given in Table 5a to provide the required
estimates using the constants of Table 5c.

TABLE 5a
Estimates of the regression coefficients and their variances.

$$\begin{aligned}
b_0 &= \bar{y}_0 \\
b_i &= A\{iy\} \\
b_{ii} &= B\{iiy\} + C_1 \sum_{j} \{jjy\} + C_2 \sum_{l} \{lly\} - (\bar{y}_0/s)
\end{aligned}$$

where $\sum_j$ and $\sum_l$ refer to summation over first and second associates of $i$.

$$\begin{aligned}
b_{ij} &= D_1\{ijy\} && i, j \text{ first associates,} \\
b_{ij} &= D_2\{ijy\} && i, j \text{ second associates.} \\
V(b_0) &= \sigma^2/n_0 \\
V(b_i) &= A\sigma^2 \\
V(b_{ii}) &= [B + 1/s^2 n_0]\,\sigma^2 \\
V(b_{ij}) &= D_1\sigma^2 && i, j \text{ first associates,} \\
 &= D_2\sigma^2 && i, j \text{ second associates.} \\
\mathrm{Cov}(b_0 b_{ii}) &= -\sigma^2/s n_0 \\
\mathrm{Cov}(b_{ii} b_{jj}) &= [C_1 + 1/s^2 n_0]\,\sigma^2 && i, j \text{ first associates,} \\
 &= [C_2 + 1/s^2 n_0]\,\sigma^2 && i, j \text{ second associates.}
\end{aligned}$$

NOTE: For BIB designs, all $i$, $j$ are considered first associates and $C_2 = D_2 = 0$. The
constants $A$, $B$, etc. for the various designs are given in Table 5c.

The following notation is employed:

$$\{iy\} = \sum_{u=1}^{N} x_{iu} y_u, \qquad \{iiy\} = \sum_{u=1}^{N} x_{iu}^2 y_u, \qquad \{ijy\} = \sum_{u=1}^{N} x_{iu} x_{ju} y_u.$$

The grand total can be regarded as the sum of products between $y$ and a dummy
variable $x_0$ which always takes the value 1 so that

$$\{0y\} = \sum_{u=1}^{N} y_u.$$

TABLE 5b
Formulae for the analysis of variance.

Correction due to the mean: $\{0y\}^2/N$

Sum of squares due to linear terms: $A \sum_{i=1}^{k} \{iy\}^2$

Sum of squares due to second degree terms:

(a) Due to interaction terms: $D_1 \sum_{i<j} \{ijy\}^2 + D_2 \sum_{i<j} \{ijy\}^2$
(the first sum taken over pairs of first associates and the second over pairs of second associates)

(b) Due to quadratic terms: $b_0\{0y\} + \sum_{i} b_{ii}\{iiy\} - \{0y\}^2/N$

Total sum of squares after correction for the mean: $\sum_{u=1}^{N} y_u^2 - \{0y\}^2/N$


In the present example the sums of products are:

$\bar{y}_0 = 90.6$; $\{0y\} = 2319.4$;
$\{1y\} = 23.2$; $\{2y\} = -23.5$; $\{3y\} = 13.6$;
$\{11y\} = 1033.6$; $\{22y\} = 1010.3$; $\{33y\} = 1027.0$;
$\{12y\} = -6.7$; $\{13y\} = -15.3$; $\{14y\} = 3.8$;
$\{24y\} = -10.5$; $\{34y\} = -17.0$;

and from Table 5c for design No. 2 we have $A = 1/12$, $B = 1/8$, $C_1 = -1/48$, $C_2 = 0$.
For example,

$$\begin{aligned}
b_1 &= \tfrac{1}{12}(23.2) = 1.933 \\
b_{11} &= \tfrac{1}{8}(1033.6) - \tfrac{1}{48}(4095.2) - \tfrac{90.6}{2} = -1.416 \\
b_{12} &= \tfrac{1}{4}(-6.7) = -1.675
\end{aligned}$$

TABLE 5c
Constants for the designs of Table 4.

Design No.    A       B          C_1        C_2
  1          1/8     1/4        -1/16       0
  2          1/12    1/8        -1/48       0
  3          1/16    1/12       -1/96       0
  4          1/24    17/216     -10/216    -1/216
  5          1/24    1/16        1/144      0
  6          1/40    1/30       -1/120     -1/720
  7          1/64    17/512      1/512     -7/512
  8          1/80    1/48       -1/600      0
  9          1/64    23/1024     9/1024    -1/1024
 10          1/96    41/3072    -7/3072    -1/3072

[The remaining columns of Table 5c ($s$, center points $n_0$, redundancy factor, non-sphericity index $I$) are not legible in this copy.]

Sample calculation for the four-factor design (No. 2).
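
The three sample estimates can be reproduced directly from the Table 5a formulae. In the sketch below the function name is illustrative; $D_1 = 1/4$ is inferred from the worked value of $b_{12}$, $s = 2$ is the block size of the generating design, and 4095.2 is the sum of the $\{jjy\}$ as used in the worked example.

```python
def quadratic_estimate(B, C1, iiy, sum_jjy, center_mean, s):
    """b_ii from Table 5a for a BIB-based design (C2 = 0)."""
    return B * iiy + C1 * sum_jjy - center_mean / s

# Constants for design No. 2; D1 is inferred from the worked example, not read from Table 5c.
A, B, C1, D1 = 1 / 12, 1 / 8, -1 / 48, 1 / 4

b0 = 90.6                                                  # mean of the center points
b1 = A * 23.2                                              # b_i = A{iy}
b11 = quadratic_estimate(B, C1, 1033.6, 4095.2, 90.6, s=2)
b12 = D1 * (-6.7)                                          # b_ij = D1{ijy}
print(b0, b1, b11, b12)
```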


6.2. The Analysis of Variance. 


The analysis of variance table is readily calculated using the relations
of Table 5b as follows.


Analysis of Variance Table

                                     s.s.    d.f.    m.s.
Due to linear terms                 268.36     4    67.09
Due to second order terms           294.92    10    29.49
Residual                            126.71    12    10.56

Total after eliminating the mean    689.99    26

The observations recorded at the center point were 93.8, 87.3, and 90.7. Had
there been no blocking of the design (that is, if the runs had been made entirely
in random order) these observations at the center point would have provided
two degrees of freedom for estimating the error variance. Their sum of squares
for deviations from their mean would have been 21.14 and the residual sum of
squares could have been split into two parts, as follows

                                        s.s.    d.f.
Residual   Replicated center points    21.14     2
           Remainder                  105.57    10

                                      126.71    12

to provide a basis for a possible test of goodness of fit for the model.

In this particular example, since the error sum of squares would have only 
two degrees of freedom, such a test would of course be very insensitive and 
provide no more than an indication that the remainder sum of squares was or 
was not of the right order of magnitude. Our main object here is to illustrate 
general principles. 
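
The split of the residual described above can be checked directly from the three center-point readings quoted in the text. The sketch below is only an illustration of the arithmetic; the variable names are invented here, and the ratio at the end is the usual lack-of-fit comparison rather than a test the authors actually report.

```python
# Center-point observations and residual line quoted in the text.
center = [93.8, 87.3, 90.7]
residual_ss, residual_df = 126.71, 12

# Sum of squares of the center points about their own mean.
center_mean = sum(center) / len(center)
pure_error_ss = sum((y - center_mean) ** 2 for y in center)
pure_error_df = len(center) - 1

# What is left of the residual after removing the replicated center points.
remainder_ss = residual_ss - pure_error_ss
remainder_df = residual_df - pure_error_df

# Ratio of remainder mean square to replicate mean square.
ratio = (remainder_ss / remainder_df) / (pure_error_ss / pure_error_df)

print(f"center points: {pure_error_ss:.2f} on {pure_error_df} d.f.")
print(f"remainder:     {remainder_ss:.2f} on {remainder_df} d.f.")
print(f"ratio of mean squares: {ratio:.2f}")
```

With only two degrees of freedom in the denominator, the ratio gives no more than the rough indication of magnitude mentioned in the text.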


6.3. Elimination of Block Effects. 


The design illustrated was actually carried out in three blocks of nine observations.
Since the blocking is orthogonal the elimination of blocks will only affect
the residual sum of squares. The block means $\bar{y}_1$, $\bar{y}_2$ and $\bar{y}_3$ are respectively
749.1/9, 750.6/9, 774.7/9 and the sum of squares associated with blocks is


2 2 ® P 
ETE TE a ue = 105.53. 
We cannot now isolate the two degrees of freedom for the differences among
the center points and the analysis of variance is as follows.


                                        s.s.     d.f.    m.s.
Due to linear terms                    268.36      4    67.09
Due to second order terms              294.92     10    29.492

126.71 { Residual                       21.18     10     2.118
       { Blocks                        105.53      2    52.765

Total after elimination of mean        689.99     26


It is seen that in this example a large proportion of the residual variance
is accounted for by the blocks. On the assumption that our model is adequate,
the mean square of 2.118 provides an estimate $s^2$ of $\sigma^2$. This estimate will therefore
be employed in calculating the standard errors of the variance coefficients.
If extra runs at the center point could be made then an equal number of these
should be allocated to each block. The pooled variances for replications at the
center point within each block would then provide an estimate of error appropriate
for testing the adequacy of the model.
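
The error mean square quoted above follows from subtracting the block sum of squares from the residual. A short sketch of that step, using only the figures printed in the text (the variable names are illustrative):

```python
# Residual line before blocks are removed, and the block sum of squares.
residual_ss, residual_df = 126.71, 12
block_ss, n_blocks = 105.53, 3

block_df = n_blocks - 1                 # three blocks give two d.f.
error_ss = residual_ss - block_ss       # residual after eliminating blocks
error_df = residual_df - block_df
error_ms = error_ss / error_df          # the estimate of the error variance

print(f"residual after blocks: {error_ss:.2f} on {error_df} d.f.")
print(f"error mean square:     {error_ms:.3f}")
```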


6.4. Variances, Covariances and Standard Errors. 


The variances and covariances of the various estimates are obtained from
the formulae in Table 5a with an appropriate estimate $s^2$ of the experimental
error variance replacing $\sigma^2$ in those formulae. In the present example we employ
the estimate $s^2 = 2.118$. Taking square roots of the estimated variances we
obtain the following values for the standard errors of the estimates:

$$\text{S.E.}(b_0) = \sqrt{2.118/3} = .84;$$

$$\text{S.E.}(b_i) = \sqrt{2.118/12} = .42;$$

$$\text{S.E.}(b_{ii}) = .66;$$

$$\text{S.E.}(b_{ij}) = \sqrt{2.118/4} = .73.$$
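
Three of these standard errors can be reproduced with a few lines of arithmetic. The sketch below assumes variance multipliers of 1/3 for the constant term (three center points), 1/12 for a linear coefficient (the constant A of design No. 2) and 1/4 for an interaction coefficient (the multiplier used in the sample calculation); these are inferred from the figures in this section, since Table 5a itself is not reproduced here.

```python
import math

s2 = 2.118  # error variance estimate from the analysis of variance

# Assumed variance multipliers (see the lead-in); Var(b) = multiplier * s2.
multipliers = {"b0": 1 / 3, "bi": 1 / 12, "bij": 1 / 4}

standard_errors = {name: math.sqrt(m * s2) for name, m in multipliers.items()}

for name, se in standard_errors.items():
    print(f"S.E.({name}) = {se:.2f}")
```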


6.5. General Comments on the Analysis. 


The simple type of analysis illustrated above is appropriate for designs 1, 2, 3,
5, and 8. The analysis of designs 4, 6, 7, 9, and 10 is slightly more complicated.
Estimates of $b_0$, the constant term, and the linear terms $b_i$ are obtained exactly
as before. The multiplier D for calculating the interaction effects however
takes two values for these designs. The multiplier $D_1$ is appropriate for these
combinations of variables listed as first associates in Table 4 and $D_2$ for those
combinations listed as second associates. In Table 4 combinations belonging
to only one of the associate classes are listed. All others belong to the other
associate class. For example, in design No. 4 the interactions 1 4; 2 5; and 3 6
are between first associates and take the multiplier $D_1$. For design No. 7 however
it is more economical in space to list the second associates which take the
multiplier $D_2$. In calculating the estimate of $b_{ii}$ (Table 5a), $C_1$ is the multiplier
of $\sum^{*}\{jjy\}$ in which the j’s are first associates of i while $C_2$ is the multiplier
of $\sum^{**}\{lly\}$ in which the l’s are second associates of i.


APPENDIX A
DERIVATION OF THE CLASS OF THREE LEVEL DESIGNS 


The requirements which need to be satisfied in order that a design shall be 
second order rotatable are given elsewhere [5]. It is desirable [5] when possible 
to satisfy the additional condition that biases due to neglected third order terms 
are zero. The conditions which the design points must then satisfy are as follows: 
Bearing in mind that for our present purpose each x can take only the values
-1, 0 or 1, we consider what is implied, first for single columns of the design,
then for pairs of columns and so on. In what follows a coincidence means the
occurrence of 1’s (plus or minus) in the same row of the design matrix. In general
where we refer to the occurrence of a “1” we mean a +1 or a -1. The equation
numbers refer to the appropriate relations above.


(a) Single columns. The same number of 1’s occur in each column. Half
of these are +1 and half -1 (Equations 1 and 2).

(b) Two columns. The number of coincident 1’s is greater than zero and the
same for all sets of two columns. For these coincident 1’s

$$\sum x_{iu} = 0 \quad\text{and}\quad \sum x_{iu}x_{ju} = 0$$

where, here and subsequently, the summation is taken over the relevant
coincidences (Equations 3, 5 and 6).

(c) Three columns. For the coincident 1’s occurring in any three columns

$$\sum x_{iu} = 0; \quad \sum x_{iu}x_{ju} = 0; \quad \sum x_{iu}x_{ju}x_{ku} = 0.$$

(Equations 7, 8 and 9)

(d) Four columns. For the coincident 1’s occurring in any four columns

$$\sum x_{iu}x_{ju}x_{ku} = 0; \quad \sum x_{iu}x_{ju}x_{ku}x_{lu} = 0.$$

(Equations 10 and 11)

(e) Five columns. For the coincident 1’s occurring in any five columns

$$\sum x_{iu}x_{ju}x_{ku}x_{lu}x_{mu} = 0.$$

(Equation 12)

Considering the possible designs we see from (b) that we cannot use any
arrangement for which no coincidences occur. It is on the other hand possible,
in principle, to generate designs in which 1’s are coincident only in pairs of
columns. In this case requirements (c), (d), and (e) are automatically satisfied.
To satisfy requirement (b) consider the coincidence of 1’s in the ith and jth
column. For these ones we require $\sum x_{iu} = 0$; $\sum x_{ju} = 0$ and $\sum x_{iu}x_{ju} = 0$.
The fewest number of coincidences for which this can be satisfied is four. The
actual values of the coincident 1’s must then be some permutation of the rows
of the $2^2$ arrangement:


-1  -1
 1  -1
-1   1
 1   1


We now need to include these component arrangements so that Equation
(4) is also satisfied. This requires that the number of coincidences in each
pair of columns is one third the number of 1’s occurring in each column. The
combinatorial properties required of the coincidences are seen to be exactly
those of a balanced incomplete block design with $r = 3\lambda$ (where, in the incomplete
block design, r is the number of times each treatment is replicated and $\lambda$
is the number of times each pair of treatments appear together in the same
block). Precisely this method of construction is employed in design No. 2.
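
The construction just described can be sketched in a few lines. The code below assumes that the four-variable arrangement places a full $2^2$ factorial of ±1 in every pair of the four columns, with zeros elsewhere (six pairs, hence 24 runs without center points); the function names are illustrative. It then checks the moment conditions numerically: every moment up to order five that contains an odd power vanishes, and the pure fourth moment is three times the mixed fourth moment, as Equation (4) requires.

```python
from itertools import combinations, product


def pairwise_design(k):
    """Rows carrying a 2^2 factorial of +/-1 in each pair of columns, zeros elsewhere."""
    rows = []
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0] * k
            row[i], row[j] = a, b
            rows.append(row)
    return rows


def moment(rows, powers):
    """Sum over all runs of the product of x_i raised to powers[i]."""
    total = 0
    for row in rows:
        term = 1
        for x, p in zip(row, powers):
            term *= x ** p
        total += term
    return total


def odd_moments_vanish(rows, max_order=5):
    """True if every moment of order <= max_order with an odd exponent is zero."""
    k = len(rows[0])
    for powers in product(range(max_order + 1), repeat=k):
        if sum(powers) <= max_order and any(p % 2 for p in powers):
            if moment(rows, powers) != 0:
                return False
    return True


design = pairwise_design(4)
ones_per_column = moment(design, (4, 0, 0, 0))      # pure fourth moment
coincidences = moment(design, (2, 2, 0, 0))         # mixed fourth moment

print(len(design), ones_per_column, coincidences, odd_moments_vanish(design))
```

In block-design terms each column appears in r = 3 pairs and each pair of columns appears together once, so the counts printed are 4r and 4λ with r = 3λ.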

Designs may also be obtained in which 1’s are coincident only in sets of three 
columns. Requirements (d) and (e) are automatically satisfied and requirement 
(c) can be met by arranging that the actual values of the coincident 1’s form 
the elements of a $2^3$ factorial. By arranging once more that the coincidences
follow those of a balanced incomplete block design with $r = 3\lambda$ all conditions
are satisfied. Design No. 5 is an example of this type of arrangement. As has 
been shown [14], exactly similar arguments may be employed for designs with 
higher numbers of coincidences. Where coincidences of more than five columns 
are involved we could satisfy all the requirements with fractional factorials 
instead of full factorials for the basic units provided that the generators of the 
fractional factorials contain not less than six elements. 

Among the designs listed in Table 4, the above method of generation accounts 
for arrangement No. 2 for four variables in twenty-four runs and arrangement 
No. 5 for seven variables in fifty-six runs. Other arrangements of this kind are 
available, but only those giving low redundancy factors are listed here. Balanced 
incomplete block designs for which $r = 3\lambda$ and for which the redundancy factors
are satisfactory are unfortunately not available for all k. To obtain useful
designs for other values of k some relaxation in our requirements must be made.
A natural modification is to employ balanced incomplete block designs for which
$r \neq 3\lambda$. It is easily seen that for such designs all the equations (1) through (12),
excepting (4), will be satisfied. Instead the design will satisfy

$$\sum_{u=1}^{N} x_{iu}^4 = \frac{r}{\lambda}\sum_{u=1}^{N} x_{iu}^2 x_{ju}^2 .$$

The ratio $r/\lambda$ may be chosen to be as close to 3 as possible. Designs of this class
in Table 4 are No. 1 (k = 3, $r/\lambda$ = 2), No. 3 (k = 5, $r/\lambda$ = 4), No. 8 (k = 11,
$r/\lambda$ = 2.5). The resulting designs are not quite rotatable but, as has been pointed
out already, the property of rotatability is desirable rather than critical and
for the designs discussed the variance of $\hat{y}$ at points equidistant from the origin
changes little. This is shown quantitatively in the last column of Table 5c
which shows the non-sphericity factor “I” for the designs considered [14]. This
non-sphericity factor measures the range of variance of $\hat{y}$ divided by its midrange
on the unit sphere

$$\sum_{i=1}^{k} x_i^2 = 1.$$

For rotatable designs the factor is zero.

A further relaxation of the same kind is to allow the use of partially balanced 
incomplete block designs. Again all the conditions will be satisfied except those 
of equations (3) and (4). Instead of this relationship, we will have for these designs 
where $\lambda_1$ is the number of times first associate treatments appear together in
the same block and $\lambda_2$ is the corresponding parameter for second associates.
Once more these designs are nearly rotatable and have low redundancy factors.
The values of I and R for these designs also are shown in Table 5c. Characteristic
of this classification of designs is that the variances of interaction coefficients
($b_{ij}$) are different depending upon whether i and j are first or second associates.
In practice, as can be determined from the formulae and constants in Table 5a 
these differences in variance are not serious and the resulting designs are per- 
fectly satisfactory. In Table 4 designs No. 4 (for k = 6), No. 6 (for k = 9), 
No. 7 (for k = 10), No. 9 (for k = 12) and No. 10 (for k = 16) are of this type. 

Equation (12) of the moment conditions for three-level designs arises from 
the requirement that biases due to third order terms be made zero. The relaxa- 
tion of this condition would preserve all the properties of the design except 
that if, contrary to assumption, three-factor interaction coefficients were not 
zero, these would cause the two-factor interaction coefficients to be biased. 
Condition (12) is relaxed in design No. 8 in which a half-replicate of the basic
$2^5$ design is employed.

Figure 1 gives the variance profiles for the designs of Table 4. These graphs
show the standardized variance function 


$$V(\hat{y}) = \frac{N \operatorname{var}(\hat{y})}{\sigma^2}$$


plotted as a function of $\rho = (\sum_i x_i^2)^{1/2}$, the distance from the center of the design.
A number of center points have been added to make the variance at $\rho = 0$
equal to the midrange variance at $\rho = 1$; this number is close to the number recommended
in Table 4. The small adjustment to $n_0$ required to distribute the center points
equally among blocks has a negligible effect on these graphs. For non-rotatable
designs the two curves indicate the maximum and minimum variance obtained
[14] on a sphere of radius $\rho$. They thus represent the envelope of all possible
variance functions that might be obtained by proceeding from the origin out
along any arbitrary radius.

Our object here is merely to present a set of designs whose properties are
sufficiently desirable to justify immediate application; it is by no means implied
that the designs we have listed are exhaustive. In particular, as will be reported
elsewhere, the method of generation here used can provide designs in which the
number of ones occurring in each row is not constant. Even within the particular
class of designs which we have considered (in which the number of ones in each
row is constant), the designs presented are far from exhaustive. A wider but
by no means complete selection of such designs is given in [14].


APPENDIX B 


In Section 3.0 the three-level 24-point arrangement is described which forms 
the basis, with added center points, for a second order rotatable design. As is 
mentioned in the text, this design is in fact a rotation of the four-variable central 
composite rotatable arrangement. This may be readily confirmed in the following 
way. 

Upon post-multiplying the matrix (excluding center points) for design No. 2
given in Table 3 by the orthogonal matrix

$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix}$$

we obtain, except for the scale factor $1/\sqrt{2}$, the design matrix of the rotatable
central composite arrangement [5] which may in an obvious shorthand notation
be denoted by

$$\begin{pmatrix} \pm 1 & \pm 1 & \pm 1 & \pm 1 \\ \pm 2 & 0 & 0 & 0 \\ 0 & \pm 2 & 0 & 0 \\ 0 & 0 & \pm 2 & 0 \\ 0 & 0 & 0 & \pm 2 \end{pmatrix}.$$
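
The rotation can be confirmed numerically. The sketch below assumes that the 24 non-center runs of design No. 2 consist of a $2^2$ factorial of ±1 in each pair of the four columns, and it multiplies by $\sqrt{2}$ times the orthogonal matrix (a block-diagonal matrix of 1's and -1's) so that all arithmetic stays in integers. The function names are illustrative.

```python
from itertools import combinations, product

# sqrt(2) times the orthogonal matrix: two 2 x 2 blocks on the diagonal.
M = [
    [1, 1, 0, 0],
    [1, -1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, -1],
]


def design_no_2():
    """Assumed 24 runs: +/-1 factorial in each pair of columns, zeros elsewhere."""
    rows = []
    for i, j in combinations(range(4), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0] * 4
            row[i], row[j] = a, b
            rows.append(tuple(row))
    return rows


def post_multiply(rows, matrix):
    """Return each row vector times the matrix."""
    size = len(matrix)
    return [
        tuple(sum(row[i] * matrix[i][j] for i in range(size)) for j in range(size))
        for row in rows
    ]


rotated = set(post_multiply(design_no_2(), M))

# Central composite arrangement: 2^4 factorial plus axial points at distance 2.
factorial_points = set(product((-1, 1), repeat=4))
axial_points = {
    tuple(sign * 2 if j == i else 0 for j in range(4))
    for i in range(4)
    for sign in (-1, 1)
}

print(rotated == factorial_points | axial_points)
```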
Graphical Procedure for Fitting the Best Line
to a Set of Points 


J. L. Dolby*


Lockheed Aircraft Corporation 


This report provides a simple graphical procedure for obtaining the slope and 
intercept of the straight line of best fit to a set of points in two dimensions. The solu- 
tion is obtained in the dual space (coordinate system) by use of mapping. Although 
this procedure is useful in itself for two dimensional problems, it may be even more 
useful as a teaching aid in illustrating some simple properties of mapping, dual 
spaces, the geometric meaning of an inverse, and the basic properties of curve fitting. 


The simplest curve fitting problem in two dimensions is the fitting of a straight
line, $y = a + bx$, to two points with distinct abscissas. In Figure I the two
points are plotted in the xy space, and a graphical solution is obtained with a
straight edge. The subscripts on a and b are to indicate that the line L connects
points $P_1$ and $P_2$. While this solution is simple geometrically, it differs from the
algebraic solution in that it does not use another space. This can be seen from
the algebraic application as follows: After substituting the coordinates of $P_1$
and $P_2$ in the general equation of a straight line, the result is:

$$y_1 = a + b x_1, \qquad y_2 = a + b x_2. \tag{1}$$

These equations can be interpreted in a second space where the coordinates
are b and a rather than x and y, as can be seen by stating (1) as:

$$a = y_1 + (-x_1) b, \qquad a = y_2 + (-x_2) b. \tag{2}$$

If we now plot these two lines in the ba space, the coordinates of their intersection
will be the solution (b, a) of (1). Note that we have reversed the usual
choice of axes. The reason for this will be apparent shortly.

The significance of this second representation is that mathematically it is
dual to the one shown in Figure I. In particular, the two points of Figure I
correspond to the two lines of Figure II and the line of Figure I corresponds to
the point of intersection in Figure II. In general any point in the xy space can
be represented by a line in the ba space. Because of this duality a solution to a
particular problem can be obtained in whichever space is more convenient.

Two methods have now been given for finding the straight line going

* This paper was written while the author was a member of the General Electric Labo- 
ratory, General Electric Company, Schenectady, N. Y. 
through two points; namely, the obvious one of using a straight edge and connecting
the two points in the xy space and the more difficult one of replotting
the two points in the ba space as lines and reading the coordinates of their
intersection. Assuming that we wish to work geometrically, it is clear that in
this case we would use the xy space to obtain our solution. However, there are
cases where it is more advantageous to use the ba space.
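
The duality between the two spaces can be made concrete with a short numerical sketch. Each point (x, y) is mapped to the line a = y + (-x)b in the ba space, and the intersection of two such lines gives the slope and intercept of the line through the two original points. The function names and the sample points are hypothetical, chosen only for illustration.

```python
def dual_line(point):
    """Map the point (x, y) to the line a = y + (-x) b, stored as (constant, slope)."""
    x, y = point
    return (y, -x)


def intersect(line1, line2):
    """Return (b, a) where the lines a = c1 + m1*b and a = c2 + m2*b meet."""
    (c1, m1), (c2, m2) = line1, line2
    if m1 == m2:
        raise ValueError("parallel dual lines: the two points share an abscissa")
    b = (c2 - c1) / (m1 - m2)
    a = c1 + m1 * b
    return b, a


def line_through(p1, p2):
    """Slope b and intercept a of the line y = a + b x through two points."""
    return intersect(dual_line(p1), dual_line(p2))


b, a = line_through((1, 3), (3, 7))
print(f"y = {a} + {b} x")
```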

To illustrate this let us add a third point in the xy space and the corresponding
third line in the ba space as in Figure III. Suppose however, that the problem
is to find the particular straight line that minimizes the error (in some sense)
at the three points. Before considering the various ways of measuring the error,
an arbitrary line in the xy space is considered (and the corresponding point
in the ba space) so that the relationship between the error measurements in
the two diagrams can be studied. In the xy space the distance from a particular
point to the given line (indicated by the heavy line segments in Figure IV) is
given very simply by the quantity $E_i = y_i - a - b x_i$. That this distance is identical
to the corresponding vertical distance in the ba space follows immediately.
To be more specific: the distance from a point to a given line measured in a
vertical direction in the xy space is identical to the distance from the corresponding
line to the corresponding point also measured in the vertical direction in
the ba space.

This fact is now used to find the line that minimizes the maximum error at
the given points. Such a line is known as the “best” line or sometimes as the
Tchebycheff line. An attempt to find this line in the xy space causes certain
difficulties. The error distances can be easily determined for any line. However,
since they are located at different points in the diagram it is not immediately
obvious how a particular trial solution should be rotated or translated to reduce
the maximum error. Although a trial and error solution can be obtained geometrically
it is desirable to have another means of accomplishing this. The ba
space provides this means. A demonstration is made by examining the consequences
of choosing a particular slope b and then finding the value of a that
minimizes the maximum error for this choice of b. In Figure IV the choice of
b = 1 is given by the heavy vertical line. For any choice of a on this line the
error is given by the distance from the point determined by this choice of a to
the lines on the diagram (measured vertically). Consequently the optimum
choice of a is such as to place the point midway between the extremal lines
(a = 1/2). If a is greater than 1/2, the distance to the lower line is increased
whereas, if a is less than 1/2 the distance to the higher line is increased. The
internal line (that is: the line or lines that for a particular value of b do not
represent the maximum or minimum value of a) is not used in such a procedure
and may be eliminated from consideration (as is done in Figure VI). The solution
will then lie somewhere on the bisector of the two extremal lines indicated by
the dashed line in Figure V. From this it follows that the solution can be found
immediately by choosing the value of b for which the extremal lines are closest
together (b = 3/2) and the value of a (a = -1/4) which is midway between
the extremal lines at that value of b.

[Figure: Solution: a = -1/4, b = 3/2]
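
The graphical search can be imitated numerically. In the sketch below, the vertical gap between the highest and lowest dual lines is evaluated at each slope where two dual lines cross; since that gap is a convex, piecewise-linear function of b, its minimum lies at one of those crossings. The intercept is then taken midway between the extremal lines. The function names and the three sample points are hypothetical and are not the points of the paper's figures.

```python
from itertools import combinations


def envelope_gap(points, b):
    """Gap between the extremal dual lines a = y - x*b at slope b, and their midpoint."""
    intercepts = [y - x * b for x, y in points]
    top, bottom = max(intercepts), min(intercepts)
    return top - bottom, (top + bottom) / 2


def tchebycheff_line(points):
    """Return (a, b, max_error) for the line minimizing the maximum vertical error."""
    # Candidate slopes: abscissas in the ba space where two dual lines intersect.
    candidates = {
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x1 != x2
    }
    best_b = min(candidates, key=lambda b: envelope_gap(points, b)[0])
    gap, a = envelope_gap(points, best_b)
    return a, best_b, gap / 2


a, b, max_error = tchebycheff_line([(0, 0), (1, 1), (2, 0)])
print(f"a = {a}, b = {b}, maximum error = {max_error}")
```

Checking every pairwise slope is quadratic in the number of points, which is adequate for hand-sized problems of the kind the graphical method addresses.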

As an example, let us apply this procedure to find the line of best fit (in the
Tchebycheff sense) to the logarithmic curve. The original function is y = log x
which provides the table of data given in Figure VIb. To these data we now
fit the straight line y = a + bx. Our solution is found by plotting the nine
straight lines a = y + (-x)b shown in Figure VIb and determining the extremal
boundary of these lines; we observe the solution to be b = .116 and a = .011.
This line has in turn been plotted in the xy space of Figure VIa with the maximum
errors again indicated heavily.

This procedure is relatively quick in that it only requires the plotting of as
many lines in the ba space as there are points in the xy space. The solution is
obvious on inspection. A similar solution can be derived for the straight line
that minimizes the sum of the absolute values of the errors though in this case the
internal lines do play an important part in the proceedings. However, this
example may be even more useful from the pedagogical point of view for the
light it may shed on the use of mapping in curve fitting work.
Tables of Tolerance-Limit Factors
for Normal Distributions* 


ALFRED WEISSBERG ** AND GLENN H. BEATTY 


Battelle Memorial Institute 


Tables of factors for use in computing two-sided tolerance limits are presented.
In contrast to previous tabulations of the tolerance-limit factor K, we tabulate the
factors r(N, P) and u(f, $\gamma$), whose product is equal to K. This results in greatly
increased compactness and flexibility. The mathematical development is discussed,
including methods used to compute the tabulated values and a study of the accuracy
of the basic approximation. A number of possible applications are discussed and
examples given.


INTRODUCTION 


It is often desirable to predict the behavior of a variable quantity in such a 
way that at least a given percentage of future measurements can be expected to 
lie within a computed interval with a specified degree of certainty. For example, 
situations of this sort occur in the control of manufacturing processes and in 
predictions of parameter variation of electronic components. The values that 
specify the computed interval are called tolerance limits, the minimum fraction 
of the population which the limits are intended to include is denoted here by P, 
and the degree of certainty is referred to as the confidence coefficient*** and
designated here as $\gamma$.

When the mean $\mu$ and standard deviation $\sigma$ of the population are unknown,
the tolerance limits must be computed on the basis of a sample estimate $\bar{x}$ of
the mean and an estimate s of the standard deviation. The tolerance limits
treated in the present work have the form $\bar{x} \pm Ks$, where the factor K accounts
for the sampling errors in $\bar{x}$ and s as well as for the population fraction P. The
method of computing K is covered in the section “Mathematical Development”.

The tables presented here are designed principally for use in computing 
two-sided symmetric tolerance limits about means of samples drawn at random 
from normal distributions. The tabulated values have no application in the 
computation of tolerance limits for nonnormal or nonparametric distributions. 
The section “Applications” indicates how the tables may be used in the computation
of one-sided tolerance limits for a normal population. However, it should
be noted that one side of a two-sided tolerance limit is not a one-sided tolerance
limit. For discussions of various types of tolerance limits, the reader is referred
to References (1), (2), (3), and (4).


* This article contains the modified text and condensed tables from a Battelle Memorial
Institute Publication. Individual copies of the complete table may be obtained without cost
from the Publications Office, Battelle Memorial Institute, Columbus, 1, Ohio.

** Now with the U. S. Food and Drug Administration, Washington, D. C.

*** A conclusion (based on a random sample from a specified population) stated with confidence
$\gamma$ means that, in an infinite sequence of samples, a fraction $\gamma$ of such conclusions would
be true.

We tabulate factors r and u, whose product equals K. Compact, extensive 
coverage and flexibility of use are the result. If K itself were tabulated, more 
than 1660 times as many entries would be needed to produce the same coverage. 
The examples in the section ‘Applications’ testify to the flexibility of the tables. 


MATHEMATICAL DEVELOPMENT 
Wald-Wolfowitz Approximation


The problem of estimating two-sided tolerance limits for the case of random
sampling from a normally distributed population was studied by Wald and
Wolfowitz (5). They consider only the case in which a sample of size N furnishes
an estimate $\bar{x}$ of the population mean $\mu$ of the form

$$\bar{x} = \frac{1}{N}\sum x$$

and an unbiased estimate s of the true population standard deviation $\sigma$ of the
form

$$s^2 = \frac{1}{N-1}\sum (x - \bar{x})^2,$$

where the summations run over all N sample values of the random variable x.
They show that, for preassigned values of P and $\gamma$, symmetrical two-sided
tolerance limits can be approximated by $\bar{x} \pm Ks$, where K = ru. The factor r
is defined by

$$\frac{1}{\sqrt{2\pi}}\int_{N^{-1/2}-r}^{N^{-1/2}+r} e^{-t^2/2}\,dt = P$$

and u is given by

$$u = \sqrt{\frac{f}{\chi^2_{f,\gamma}}},$$

where $\chi^2_{f,\gamma}$ is defined by

$$\Pr(\chi^2_f > \chi^2_{f,\gamma}) = \gamma$$

and f is the number of degrees of freedom associated with s. Should tolerance
intervals be computed for each of an infinite sequence of samples from a normal
universe, a proportion $\gamma$ of these intervals would each contain at least the fraction
P of the universe.
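
As an illustration of the definition of u, the sketch below computes the required chi-square point by bisection on a series evaluation of the chi-square distribution function, then takes the square root of f divided by that point. This is a modern stand-in under stated assumptions, not the procedure used to build the tables: the function names are invented here, and the series is intended only for moderate f.

```python
import math


def chi2_cdf(x, f):
    """Pr(chi-square with f d.f. <= x), via the series for the incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = f / 2, x / 2
    term, total, n = 1.0, 1.0, 0
    # Sum of z^n / ((a+1)(a+2)...(a+n)), stopped once terms are negligible.
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1))


def chi2_point(f, gamma):
    """The value c with Pr(chi-square_f > c) = gamma, found by bisection."""
    lo, hi = 0.0, f + 10 * math.sqrt(2 * f) + 10
    target = 1 - gamma
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, f) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def u_factor(f, gamma):
    """u(f, gamma) = sqrt(f / chi-square point exceeded with probability gamma)."""
    return math.sqrt(f / chi2_point(f, gamma))


print(f"u(9, .95) = {u_factor(9, 0.95):.4f}")
```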

Wallis (6) investigated some applications of the above approximation. He
points out that, without assuming any connection between N and f, the Wald-Wolfowitz
derivation of tolerance factors may be carried through with negligible
alterations. Furthermore, the number N does not have to represent the sample
size. Although Wald and Wolfowitz treat only the case in which a random sample
of N observations is drawn from a single normal population and f = N - 1,
their results can be readily generalized to any normally distributed variate
for which there is a normally distributed estimate of the mean with variance
$\sigma^2/N$ and an estimate of the variance independently distributed as $\sigma^2\chi^2/f$ for
f degrees of freedom, where f is not necessarily N - 1. In this usage, N is called
the effective number of observations*. It largely determines the accuracy of
the Wald-Wolfowitz approximation, as shown later.


Computation of r(N, P) 


Two methods for obtaining accurate values of r(N, P) are the trial-and-error
method and Newton’s method. In the trial-and-error method, a series of estimated
values of r is put in the expression

$$\frac{1}{\sqrt{2\pi}}\int_{-N^{-1/2}-r}^{N^{-1/2}+r} e^{-t^2/2}\,dt + \frac{1}{\sqrt{2\pi}}\int_{N^{-1/2}-r}^{-N^{-1/2}+r} e^{-t^2/2}\,dt = 2P$$

and P is evaluated from tables of the normal distribution (7). Nearly constant
first differences of the tabulated integral greatly simplify the computation.
Newton’s method applied to the function

$$f(r) = \frac{1}{\sqrt{2\pi}}\int_{N^{-1/2}-r}^{N^{-1/2}+r} e^{-t^2/2}\,dt - P = 0$$

requires that

$$r_{j+1} = r_j - \frac{f(r_j)}{f'(r_j)}$$

be evaluated for j = 1, 2, ··· to obtain a convergent sequence $r_1, r_2, \cdots$ that
is terminated when the desired accuracy is obtained. The fact that the normal
distribution tables (7) used present the ordinate and integral of the normal
density function side by side makes Newton’s method preferable to the trial-and-error
method.
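
Newton's method for r can be sketched with the normal distribution functions in the Python standard library taking the place of the printed tables. The starting value (the limiting value of r as N grows without bound), the tolerance, and the function name are choices made for this illustration only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def r_factor(N, P, tol=1e-10, max_iter=50):
    """Solve Phi(N**-0.5 + r) - Phi(N**-0.5 - r) = P for r by Newton's method."""
    c = 1 / math.sqrt(N)
    r = STD_NORMAL.inv_cdf((1 + P) / 2)  # starting value: the large-N limit
    for _ in range(max_iter):
        # f(r) is the integral of the normal density minus P;
        # f'(r) is the sum of the ordinates at the two limits.
        f = STD_NORMAL.cdf(c + r) - STD_NORMAL.cdf(c - r) - P
        f_prime = STD_NORMAL.pdf(c + r) + STD_NORMAL.pdf(c - r)
        step = f / f_prime
        r -= step
        if abs(step) < tol:
            break
    return r


r10 = r_factor(10, 0.95)
coverage = STD_NORMAL.cdf(10 ** -0.5 + r10) - STD_NORMAL.cdf(10 ** -0.5 - r10)
print(f"r(10, .95) = {r10:.4f}, coverage = {coverage:.6f}")
```

The derivative needs only the normal ordinate, which is why tables listing ordinate and integral side by side suit this method.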

As r(N) is approximately linear on a reciprocal scale, only a scattering of 
r-values for each of the six levels of P (.50, .75, .90, .95, .99, .999) had to be 
computed by the above methods. This basic set of values then was used to supply 
end points for straight lines drawn on reciprocal graph paper (KE 359-25L) 
from which the remainder of the values were obtained. Several graphs were 
used for each value of P, with end points (computed to an accuracy beyond 
that of the graphical scale) obtained from the basic set of values for N = 10, 20,
40, 50, 80, 100, 200, 500, 1000, and $\infty$. Proper algebraic magnification of the
graphical scales enabled r-values to be read to four decimals. Occasional numerical 
computations of r were made in order to monitor the accuracy of the graphical 
interpolations. For N < 10, r was computed numerically. For each level of P, 


* The effective number of observations for a certain statistic is that number which, when 
divided into the variance of an observation, gives the variance of the statistic. 
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TABLE OF r(N, P) 


P 
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75 
1.6859 
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3.3344 
3.3333 
N = Number of measurements used to obtain $\bar{x}$ (the sample estimate of the population mean).
P = Proportion of population included between limits.
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P 
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7 
2 


2.5900 


2.5898 
2.5897 
2.5896 
2.5894 
2.5893 


2.5891 
2.5890 
2.5889 
é d ; 2.5887 
1561 1.6531 : 2.5886 


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P 


3.3060 
3.3053 
3.3047 
3.3041 
3.3035 


3.3030 
3.3026 
3.3022 
3.3018 
3.3014 


3.3010 
3.3007 
3.3004 
3.3001 
3.2998 


3.2996 
3.2993 
3.2991 
3.2989 
3.2987 


3.2983 
3.2980 
3.2976 
3.2973 
3.2971 


3.2968 
3.2966 
3.2964 
3.2962 
3.2960 


3.2956 
3.2953 
3.2951 
3.2948 
3.2946 


3.2944 
3.2943 
3.2941 
3.2940 
3.2938 


3.2933 
3.2929 
3.2926 
3.2923 
3.2922 


2.5765 3.2914 
2.5763 . 
2.5761 

2.5760 

2.5758 


the values of r(N, P) tabulated are for
N = 1(1)300(10)1000(1000)10000, $\infty$*.


Final checks and adjustments were made by differencing all values of r(N, P) 
with respect to N. We estimate that approximately 90 per cent of the tabulated 
values are correct to four decimal places and that the remainder are off by one 
unit in the fourth decimal. 


Computation of u(f, $\gamma$)


Obtaining accurate values of $\chi^2_{f,\gamma}$ for all combinations of f and $\gamma$ which are
presented in the table of u(f, $\gamma$) was not easy. No single source was adequate
for this purpose. The following tabulation summarizes the way in which this
problem was handled. Numbers in the body of the tabulation correspond to
the reference list.


f 50 ; d 


1 to 30 (8) (8) (8) 
31 to 100 (9) (10) (9) 
101 to $\infty$ (12) (12) (12)


Percentage points of the chi-square distribution to six significant figures
were taken from Thompson (8) for f = 1(1)30(10)100 for $\gamma$ = .50, .75, .90, .95,
and .99.

Percentage points of the chi-square distribution to three decimal places were
taken from Hald and Sinkbaek (9) for f = 31(1)100 for $\gamma$ = .50, .90, .95, and .99
and f = 1(1)100 for $\gamma$ = .999. (There appears to be a misprint in Hald and
Sinkbaek’s table. The value $\chi^2$ = 96.344 for f = 97, $\gamma$ = 0.50 ought to read
96.334. The latter value was used in our calculations.) With few exceptions,
four-decimal accuracy in u was achieved because of the gain in accuracy from
the square-root operation. Where the two tables overlapped, values of $f/\chi^2$
computed from Hald and Sinkbaek were compared with values of $f/\chi^2$ computed
from Thompson, with good agreement.

Since Hald and Sinkbaek’s table does not include y = .75, it was necessary, 
for that case, to use Pearson’s Tables of the Incomplete Gamma-Function (10) 
to obtain four-decimal values of chi-square, by Aitken’s five-point method of 
inverse interpolation, for f = 32(2)102. From the values thus obtained, four- 
decimal values of chi-square for f = 31(2)99 were computed by use of Lagrangian 
five-point interpolation coefficients (11). 

Four-decimal values of chi-square were obtained for all values of $\gamma$ for f > 100
by the Cornish-Fisher approximation (12)

$$\chi^2_{f,\gamma} = f + f^{1/2}G_1(X) + G_2(X) + \frac{G_3(X)}{f^{1/2}} + \frac{G_4(X)}{f} + \frac{G_5(X)}{f^{3/2}},$$


* The numbers in parentheses give the argument interval. Thus, 1(1)300 says that tabular
entries are provided for argument values 1, 2, 3, ···, 300.
f = Number of degrees of freedom associated with s (sample estimate of population standard
deviation).


$\gamma$ = Confidence coefficient associated with limits.
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where X is a normal variate associated with a cumulative probability p such 
that the following relationships hold: 


y=1-p 
Xi., = —X, 
G(—X) = (-1)'G(X), for t=1,--- ,5 
G(X) = V2x 
G(X) = #(X* — 1) 
1 


G(X) = 9V3 (X" — 7X) 


a 

405 
fa: ee 
4860 +/2 


The values of G;(X) used were those tabulated in the reference (six decimals). 
For each level of y, the values of u(f, y) tabulated are for 


f = 1(1)1000(1000)10000, ~. 


All calculations of uw were performed in the Battelle Digital Computing 
Laboratory with the aid of an IBM 650 Electronic Computing Machine. Results 
were checked by differencing. The table of u(f, y) was printed on an IBM 407 
Tabulator, using output cards from the machine calculations. We believe there 
are no errors caused by inaccurate calculations or transcribing results. Uncer- 
tainties in our sources of chi-square make it impossible to guarantee all of our 
u-values. However, we estimate that roughly 90 per cent are correct in the 
fourth decimal place and most of the remainder are off by one unit in the fourth 
decimal place. The accuracy for y = .999, f < 20, cannot be guaranteed. 


G(X) = (6X* + 14X” — 32) 


G,(X) = (9X® + 256X* — 438X) 
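
The Cornish-Fisher step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the polynomial coefficients are written in the commonly published form of the expansion and should be checked against reference (12), the relation u = sqrt(f / chi-square) is inferred from the remarks on f/χ² and the square-root operation, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chi_square_cornish_fisher(f: float, p: float) -> float:
    """Approximate the chi-square value with cumulative probability p and f d.f."""
    x = NormalDist().inv_cdf(p)  # normal variate X with cumulative probability p
    g1 = sqrt(2.0) * x
    g2 = (2.0 / 3.0) * (x**2 - 1.0)
    g3 = (x**3 - 7.0 * x) / (9.0 * sqrt(2.0))
    g4 = -(6.0 * x**4 + 14.0 * x**2 - 32.0) / 405.0
    # Coefficients of g5 are assumed from the usual published expansion.
    g5 = (9.0 * x**5 + 256.0 * x**3 - 433.0 * x) / (4860.0 * sqrt(2.0))
    root_f = sqrt(f)
    return f + g1 * root_f + g2 + g3 / root_f + g4 / f + g5 / (f * root_f)


def u_factor(f: float, gamma: float) -> float:
    """u(f, gamma), assumed to equal sqrt(f / chi-square) with p = 1 - gamma."""
    return sqrt(f / chi_square_cornish_fisher(f, 1.0 - gamma))


if __name__ == "__main__":
    # The two u-values quoted in the worked examples of this paper.
    print(round(u_factor(388, 0.95), 4))
    print(round(u_factor(118, 0.95), 4))
```

Under these assumptions the sketch agrees with the factors u = 1.0630 (f = 388) and u = 1.1209 (f = 118) used in the worked examples, both of which lie in the range f > 100 where the approximation was applied.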


Comparison of Results with Bowker’s Table 


Bowker’s table (1) of two-sided tolerance-limit factors, although restricted 
to the special case of f = N − 1, provided a valuable check on our calculations. 
In his table, three-decimal K-factors are given for all combinations of P = .75, 
.90, .95, .99, .999; γ = .75, .90, .95, .99; and 


N = 2(1)102(2)180(5)300(10)400(25)750(50)1000, ∞. 


Wherever possible, our values of r and u were multiplied together for comparison 
with Bowker’s K-factors. (In doing this check, an error was discovered in Bowker’s 
table. Apparently, an incorrect value of chi-square was used for γ = .95, f = 38, 
which affected the K-factors for γ = .95, N = 38 and 39. However, the dis- 
crepancies are small and not likely to be important in practical applications.) 
There was an occasional difference of one unit in the third decimal place that 
could be accounted for in the rounding procedures. Otherwise, the check was 
excellent. 

Although a table of two-sided tolerance limits by Lieberman (13) corresponds 
more closely in principle to ours than does Bowker’s table, it was not broad 
enough in scope to be used effectively as a check on our results. However, it 
was Lieberman’s table that brought our attention to the question of tolerance 
limits for situations where f ≠ N − 1. 


Accuracy of Basic Formula 


As implied earlier, the tolerance-limit interval x̄ ± rus is not the correct 
interval for N, P, f, and γ, as intended, but is only an approximation to the 
correct interval. The interval x̄ ± rus really corresponds to N, P, f, and γ′, 
where the true confidence coefficient γ′ differs from the intended confidence 
coefficient γ by an unknown amount. This stems from the fact that, in order to 
obtain a formula amenable to numerical calculations, Wald and Wolfowitz 
were forced to the approximation 

$$\left|\frac{\bar{x} - \mu}{\sigma}\right| \approx \frac{1}{\sqrt{N}}$$

in the derivation of r(N, P). Here, 1/√N is the expected value of |(x̄ − μ)/σ| 
for samples of size N drawn at random from a normal population with mean 
μ and standard deviation σ. In their theoretical development, Wald and 
Wolfowitz include a method for computing upper and lower limits (denoted 
here by $\bar{\gamma}$ and $\underline{\gamma}$, respectively) for the correct confidence coefficient γ′ for the 
tolerance-limit interval x̄ ± r(N, P)u(f, γ)s. The method provides a useful 
check on the accuracy of the basic formula, but, because the computations 
are laborious, only two pairs of $\bar{\gamma}$ and $\underline{\gamma}$ were calculated. These, together with 
other results from Reference (5), are given in the following tabulation along 
with 1/N² which, as Wald and Wolfowitz show, represents |γ − γ′| within an 
order of magnitude. 


1/N² 


.25000 
.01235 
.01235 
.00160 
.00160 
.00015 

In the foregoing discussion, the problem was regarded from the viewpoint 
of how γ differs from the correct value γ′ for a given K. This approach simplified 
the mathematical development. However, the accuracy of the basic formula 
may be required in terms of how the approximation x̄ ± Ks differs from the 
correct tolerance-limit interval x̄ ± K′s. There is more than one way of doing 
this, but the easiest way appears to be to use the 1/N² approximation and 
evaluate the rate of change of K with respect to γ for particular values of γ, 
holding N, P, and f constant. (The Wald-Wolfowitz formulas for $\bar{\gamma}$ and $\underline{\gamma}$ suggest 
that it is theoretically possible to obtain exact tolerance limits by computing a 
sequence of K-values that converge on the correct value K′ for a given γ. Although 
the computations would be extremely lengthy, even to obtain one tolerance inter- 
val, they might be feasible with the aid of high-speed electronic equipment.) The 
following tabulation presents the estimated number of significant decimal places 
in the product ru for selected combinations of N, P, f, and γ. 


N 10 50 


¥ Pe .95  .999 75 95 


75 2 2 2 3 
95 2 
.999 1 


75 2 2 t 
.95 2 3 
.999 1 2 


75 3 2 2 4 
.95 2 2 2 3 
.999 1 1 1 2 


(a) Computation of tolerance limits not recommended. 


Although the figures in the above tabulation must be regarded as crude estimates, 
they should prevent overconfidence in the tables of r(N, P) and u(f, γ) and 
indicate the relative importance of N, P, f, and γ. It appears that both N and 
γ have a strong effect on the approximation relative to P and f and that the 
size of N is the predominant consideration. It can be concluded that the Wald- 
Wolfowitz approximation is excellent for reasonably large samples but generally 
not for small samples (particularly for N < 10). 


Applications 


The tables of factors presented here reduce the calculation of tolerance limits 
to a simple procedure. From a random sample drawn from an appropriate 
population, the following four quantities are determined: 


x̄ = sample estimate of population mean 
N = effective sample size associated with x̄ 
s = unbiased sample estimate of population standard deviation 
f = number of degrees of freedom associated with s 


In addition, it is necessary to specify the two fractions: 


P = proportion of population included between limits 
γ = confidence coefficient associated with limits 


The quantities N, P, f, and γ are used to find the tolerance-limit factors 
r = r(N, P) and u = u(f, γ) from the tables. The endpoints of the tolerance 
interval are given by the formulas: 


x̄ + rus = upper tolerance limit 
x̄ − rus = lower tolerance limit 


Univariate Analysis 


In the case where the sample $x_1, x_2, \ldots, x_n$ has been randomly drawn from 
a single univariate normal distribution, all the observations are used in com- 
puting both x̄ and s. Consequently, N = n and f = N − 1, and the calculation 
of tolerance limits in this common situation proceeds as indicated above. 

In addition to this application, the tables of r(N, P) and u(f, y) can be used 
in conjunction with various, more advanced, statistical techniques, including 
those discussed in the remainder of this section. 


Analysis of Variance 


This example deals with the reliability of electronic components for an air- 
borne missile-guidance system subject to extreme physical environments induced 
by high-speed flight. (Research on this Battelle project motivated the construc- 
tion of these tolerance-limit tables.) The success of a missile-guidance system 
depends on the correct functioning of many electronic components. The param- 
eters of these components are subject to “drift”. The computation of tolerance 
limits provides quantitative information that enables the engineer to design 
each electrical circuit to tolerate these drifts. Each component used to make 
up a circuit must operate within certain limits or the circuit may malfunction 
and the result may be a failure of the whole system. The form of analysis in 
this example allows the designer some latitude in choosing components for each 
requirement. Furthermore, because of the generality of its interpretation, the 
factorial-type experiment yields information useful in future design problems. 

In one phase of this program, the effects of ambient temperature and electrical 
load (power dissipation) on wire-wound resistors were studied by a series of 
factorial experiments. Each experiment consisted of 45 observations on resistance 
change (in per cent of initial resistance) for every combination of three tempera- 
tures (25, 100, and 125 degrees Centigrade) and three loads (50, 100, and 150 
per cent of rated wattage), for a given resistor type and load-life. Investigation 
confirmed the assumption that for the various test conditions the distributions 
of the dependent variable (resistance change) were normal and had a common 
variance. 

The following typical results demonstrate the analysis of variance used to 
interpret the data from each experiment. 


Source of        Degrees of    Mean           Variance 
Variation        Freedom       Square         Ratio 
Temperature        2           0.00002745 
Load               2           0.00000635 
Interaction        4           0.00000231 
Residual         388           0.00000311 


The degrees of freedom total 396 instead of 404, because missing values reduced 
the number of degrees of freedom for the residual by 8. Statistically, temperature 
is very highly significant, since the ratio of its mean square to the interaction 
mean square exceeds the tabulated value 7.03 corresponding to the .001 risk 
level for 2 and 388 degrees of freedom. It can be concluded that temperature 
really affects resistance. Load is not significant by ordinary standards, since 
the ratio of its mean square to the interaction mean square is less than the 
tabulated value 3.02 corresponding to the .05 risk level for 2 and 388 degrees 
of freedom. However, it might be argued that additional testing would verify 
the existence of a load effect. Interaction of temperature and load is definitely 
not significant. 

These results justify a separate analysis for each test condition. For the 45 
resistors tested at 25 degrees Centigrade and a load of 100 per cent, the mean 
change in resistance is 0.17 per cent. An estimate of the standard deviation is 
furnished by the square root of the residual mean square. A tolerance interval, 
such that the probability is 95 per cent that at least 99 per cent of all resistors 
(of the type tested) in similar applications will perform within that interval, 
is determined by the quantities: 


x̄ = 0.17%     s = 0.1764% 
N = 45        f = 388 
P = .99       γ = .95 

Two-sided tolerance limits are computed as follows: 

x̄ + rus = 0.17% + (2.6039)(1.0630)(0.1764%) = 0.66% 
x̄ − rus = 0.17% − (2.6039)(1.0630)(0.1764%) = −0.32% 


Consequently, the design engineer can be 95 per cent certain that, if the system 
operates under the conditions simulated in these tests, no more than one per 
cent of these resistors will drift outside the interval from —0.32 per cent to 
+0.66 per cent of their initial resistance. Other limits apply to other environ- 
mental conditions. 
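
The arithmetic of this example can be retraced with a short Python sketch. It rests on an assumption that is not restated in this section: r(N, P) is taken to be the root of the usual Wald-Wolfowitz equation Φ(1/√N + r) − Φ(1/√N − r) = P, solved here by bisection. The function names are illustrative only, and u is simply copied from the example.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function


def r_factor(n_eff: float, p: float) -> float:
    """Solve Phi(1/sqrt(N) + r) - Phi(1/sqrt(N) - r) = P for r (assumed definition)."""
    shift = n_eff ** -0.5
    lo, hi = 0.0, 20.0  # the coverage increases with r, so the root is bracketed
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if _PHI(shift + mid) - _PHI(shift - mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tolerance_limits(mean: float, s: float, r: float, u: float):
    """Return (lower, upper) = mean -/+ r*u*s."""
    half_width = r * u * s
    return mean - half_width, mean + half_width


# Resistor example: N = 45, P = .99, s = 0.1764 per cent, u = 1.0630 (from the text).
r = r_factor(45, 0.99)
lower, upper = tolerance_limits(0.17, 0.1764, r, 1.0630)

if __name__ == "__main__":
    print(round(r, 4), round(lower, 2), round(upper, 2))
```

With the assumed definition, the computed r agrees with the tabulated 2.6039 to about four decimals, and the limits round to the −0.32 and +0.66 per cent quoted above.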


Regression Analysis 


The application of the theory of tolerance intervals to problems involving re- 
gression analysis is illustrated by the following hypothetical example. 

The sample $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ has been randomly drawn from a 
bivariate population for which y is a normally distributed random variable 
whose variance is σ² and whose mean is a linear function of x. It is desired to 
compute a tolerance region such that, if x is specified, limits are determined 
that will cover at least 90 per cent of the population of y-values with a certainty 
of 95 per cent. The computation is illustrated with the following data: 


n = 120        Σx = 3444        Σy = 1135 
Σxy = 32665    Σx² = 99605      Σy² = 11103 


The estimated mean Y of y for any value of x is given by the regression line 
Y = a + bx, the constants of which are obtained as follows: 

$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = 0.119,$$

$$a = \bar{y} - b\bar{x} = \frac{\sum y - b\sum x}{n} = 6.04.$$

The standard error of estimate s, which estimates σ, is given by 

$$s^2 = \frac{\sum (y - Y)^2}{n - 2} = \frac{n\sum y^2 - (\sum y)^2 - b\left(n\sum xy - \sum x \sum y\right)}{n(n - 2)} = 3.0256,$$

$$s = 1.74.$$


To calculate tolerance intervals (6), we note that the variance of Y for any 
value of x is given by the Working-Hotelling formula (14) 

$$\sigma_Y^2 = \sigma^2\left[\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x - \bar{x})^2}\right],$$

whence the effective number of observations for a given x is 

$$N = \frac{n\sum (x - \bar{x})^2}{\sum (x - \bar{x})^2 + n(x - \bar{x})^2} = \frac{n\sum x^2 - (\sum x)^2}{\sum x^2 + nx^2 - 2x\sum x}.$$

That is, for any value of x, the mean value of y is determined as accurately from 
the regression line as if N observations had been made at that value of x. The 
procedure is illustrated for x = 31: 


N = 65.5       f = n − 2 = 118 
P = .90        γ = .95 
r = 1.6574     u = 1.1209 

The mean of y corresponding to x = 31 is 

Y = 6.04 + 0.119x = 9.73. 

Tolerance limits for y corresponding to x = 31 are 

Y − rus = 6.50,     Y + rus = 12.96. 


Similar computations for other values of x supply points which determine 
tolerance-limit curves about the regression line, as shown in Figure 1. 


Figure 1—Two-sided tolerance-limit curves about regression line. 
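
The regression computation can be retraced from the six sums given in the example. The following Python sketch is an illustration only: variable and function names are hypothetical, the factors r and u are copied from the text rather than computed, and full precision is carried throughout, so the intercept comes out near 6.05 rather than the printed 6.04 (apparently because the text rounds b to 0.119 before computing a).

```python
from math import sqrt

# Sums quoted in the regression example.
n = 120
sum_x, sum_y = 3444.0, 1135.0
sum_xy, sum_x2, sum_y2 = 32665.0, 99605.0, 11103.0

# Corrected sums of squares and products about the means.
s_xx = sum_x2 - sum_x**2 / n
s_xy = sum_xy - sum_x * sum_y / n
s_yy = sum_y2 - sum_y**2 / n

b = s_xy / s_xx                    # slope of the regression line
a = (sum_y - b * sum_x) / n        # intercept
s2 = (s_yy - b * s_xy) / (n - 2)   # squared standard error of estimate
s = sqrt(s2)


def effective_n(x: float) -> float:
    """Effective number of observations at x (Working-Hotelling variance)."""
    return (n * sum_x2 - sum_x**2) / (sum_x2 + n * x**2 - 2.0 * x * sum_x)


def tolerance_limits_at(x: float, r: float, u: float):
    """Return (lower, upper) tolerance limits about the regression line at x."""
    y_hat = a + b * x
    half_width = r * u * s
    return y_hat - half_width, y_hat + half_width


x0 = 31.0
n_eff = effective_n(x0)
lower, upper = tolerance_limits_at(x0, r=1.6574, u=1.1209)

if __name__ == "__main__":
    print(round(b, 3), round(s2, 4), round(n_eff, 1), round(lower, 2), round(upper, 2))
```

Repeating the last two calls over a grid of x-values, with r looked up for each effective N, would trace out tolerance-limit curves of the kind shown in Figure 1.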
One-Sided Tolerance Limits 


A collection of tables of one-sided tolerance-limit factors for normal distribu- 
tions has been published by Owen (15) for the case f = N — 1. In Appendix 
A of his tables, he points out that two-sided limits can be converted to approxi- 
mate one-sided limits as follows: 


upper one-sided limit = x̄ + (1.05)rus 
lower one-sided limit = x̄ − (1.05)rus 


where x̄, r, u, and s are defined the same as for two-sided limits except that a 
fraction (P + 1)/2 of the population is expected to be covered by the one-sided 
interval as opposed to a fraction P of the population for the two-sided interval. 
Table IV of Owen’s collection contains the results of applying the above con- 
version to Bowker’s table of two-sided tolerance limits. We have not considered 
the application of this conversion to cases other than when f = N — 1. 


Computer Programming 


The tables of r(N, P) and u(f, γ) presented here should be adequate for nearly 
all applications of tolerance limits in hand calculations. When a great many 
tolerance-limit intervals are needed, or when the need for them occurs in a 
large-scale data-processing job, it may be desirable to program the calculation 
of tolerance limits for an automatic computer. The basic formulas do not lend 
themselves well to this operation, even in the form of the Wald-Wolfowitz 
approximation. However, the present tables are suitable for direct inclusion in 
programs for high-speed computers with the usual table look-up feature. In 
many cases, the calculation of all tolerance limits can be handled in this way. 
The separate tabulation of r and u allows many more combinations of N, P, f, 
and γ to be represented than would be possible if only the product ru = K were 
tabulated. Furthermore, it may be necessary to use only selected portions of 
the tables. Usually only one pair of values for P and γ is needed in a particular 
application. The values of N and f are usually limited to certain ranges, as well. 
If necessary, wider intervals can be used for these arguments, although this 
might introduce the need for an interpolation routine. 

If word-storage restrictions prevent the complete adoption of table look-up, 
it might still be used partially. The formula for r is difficult to program, but 
an approximation of the Cornish-Fisher type could be programmed, if there 
were sufficient need, to furnish satisfactory u-values for f large. 


Related Calculations 


In addition to the construction of tolerance limits, tables of r(N, P) and u(f, γ) 
have potential applications to other problems. For instance, the table of r(N, P) 
might be useful in setting confidence limits where symmetry about the center of 
the normal distribution is not desired, and the table of u(f, γ) could be used to 
obtain chi-square fractiles that are not readily available. 
TECHNOMETRICS 


On the Evaluation of the Negative Binomial 
Distribution with Examples 


G. P. Patil 


Indian Statistical Institute and The University of Michigan 


The Negative Binomial Distribution is a frequently encountered standard dis- 
crete distribution function. By the use of the available Binomial or Incomplete Beta 
Function tables it can be evaluated easily through the use of certain identities. 


1. INTRODUCTION 


Different approaches are possible with regard to the basic structure of the 
standard discrete distributions like the binomial, Poisson, negative binomial, 
or logarithmic series. Under plausible circumstances, these distributions may 
be regarded as descriptive models of populations. For instance, the number of 
accidents met with, over a period of time by a particular individual, can ordinarily 
be assumed to have a Poisson distribution. However, different persons may have 
different accident proneness as measured by the average number of accidents 
to the individual. If this average has, say, a Pearson’s type III distribution, 
the distribution of the number of accidents pooled over the individuals can be 
shown to follow the negative binomial law. 

Again, the standard discrete distributions may arise as a result of the sampling 
scheme adopted. In sampling with replacement n items from a lot of manu- 
factured items, the number of defectives follows the binomial law. On the other 
hand, if one uses what is known as the inverse binomial sampling procedure 
(that is, one goes on sampling with replacement until one gets a fixed number 
such as k of defectives), the number of items sampled follows the negative 
binomial distribution. 


2. SUMMARY 


Let 

$$u(x, p, k) = \binom{k + x - 1}{x} p^k (1 - p)^x$$

where 0 < p < 1, 0 < k < ∞, x = 0, 1, 2, .... 
In order to evaluate the negative binomial distribution function: 

$$Y(r, p, k) = \sum_{x=0}^{r} u(x, p, k)$$

we can use (positive) binomial distribution function tables, when k is a positive 
integer, or tables of the incomplete beta function for any k. Thus no special 
tables are needed, since: 

$$Y(r, p, k) = 1 - B(k - 1, p, r + k), \quad k = 1, 2, 3, \ldots \tag{3}$$

$$Y(r, p, k) = I_p(k, r + 1), \quad 0 < k < \infty \tag{5}$$

where 

$$B(c, p, n) = \sum_{x=0}^{c} \binom{n}{x} p^x (1 - p)^{n - x}$$

is the binomial distribution function and 

$$I_p(m, n) = \frac{1}{B(m, n)} \int_0^p u^{m-1} (1 - u)^{n-1}\,du$$

(the incomplete beta function), B(m, n) in the last formula being the complete beta function. 


3. PROOFS 

The proof of (3) is as follows: 

$$Y(r, p, k) = \sum_{x=0}^{r} \binom{k + x - 1}{x} p^k (1 - p)^x$$

= Probability that at most k + r independent trials are required to 
get k successes when p is the probability of success at each trial. 

= Probability that at least k successes occur in k + r independent 
trials when p is the probability of success at each trial. 

$$= \sum_{x=k}^{k+r} \binom{k + r}{x} p^x (1 - p)^{k + r - x}$$

$$= 1 - B(k - 1, p, r + k).$$


The proof of (3) given above is based upon probability argument. The identity 
is essentially algebraic and hence it appears to be interesting to establish it 
algebraically. The following alternative algebraic proof of (3) is due to Professor 
C. C. Craig. 

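The identity to be established is 

$$\sum_{x=0}^{r} \binom{k + x - 1}{x} p^k (1 - p)^x = \sum_{x=k}^{k+r} \binom{k + r}{x} p^x (1 - p)^{k + r - x}.$$

Divide by $p^k$ and then set 1 − p = u. The identity becomes 

$$\sum_{x=0}^{r} \binom{k + x - 1}{x} u^x = \sum_{x=k}^{k+r} \binom{k + r}{x} (1 - u)^{x - k} u^{k + r - x}. \tag{7}$$

In the right hand side of (7) the coefficient of $u^s$ (0 ≤ s ≤ r) is 

$$(-1)^s \left[\binom{r}{s} - \binom{k + r}{1}\binom{r - 1}{s - 1} + \binom{k + r}{2}\binom{r - 2}{s - 2} - \cdots + (-1)^s \binom{k + r}{s}\right].$$

This is the coefficient of $u^s$ in 

$$(-1)^s (1 - u)^{k + r} (1 - u)^{-(r - s + 1)} = (1 - u)^{k + s - 1} (-1)^s.$$

But the coefficient of $u^s$ in the left hand side of (7) is $\binom{k + s - 1}{s}$. Hence the 
identity. 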
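Also we have 

$$I_p(k, r + 1) = \frac{1}{B(k, r + 1)} \int_0^p u^{k-1} (1 - u)^r\,du,$$

so that 

$$\frac{d}{dp}\left[I_p(k, r + 1)\right] = \frac{1}{B(k, r + 1)}\, p^{k-1} (1 - p)^r = \frac{k}{p^2}\, u(r, p, k + 1).$$

Next, 

$$\frac{d}{dp}\left[p^k (1 - p)^x\right] = k p^{k-1} (1 - p)^x - x p^k (1 - p)^{x-1},$$

so that 

$$\frac{d}{dp}\left[Y(r, p, k)\right] = \sum_{x=0}^{r} \binom{k + x - 1}{x} p^{k-1} (1 - p)^{x-1} \left[(k + x)(1 - p) - x\right]$$

$$= \sum_{x=0}^{r} (k + x)\binom{k + x - 1}{x} p^{k-1} (1 - p)^x - \sum_{x=1}^{r} x \binom{k + x - 1}{x} p^{k-1} (1 - p)^{x-1}$$

$$= \frac{k}{p^2}\, Y(r, p, k + 1) - \frac{k}{p^2}\, Y(r - 1, p, k + 1)$$

$$= \frac{k}{p^2}\, u(r, p, k + 1).$$

Hence 

$$\frac{d}{dp}\, Y(r, p, k) = \frac{d}{dp}\left[I_p(k, r + 1)\right]. \tag{10}$$

Integrating (10) with respect to p, we have 

$$Y(r, p, k) = I_p(k, r + 1) + C$$

where C is a constant of integration. 
Now at p = 0, $Y(r, 0, k) = 0 = I_0(k, r + 1)$. 

∴ C = 0 and hence the statement (5). 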







4. APPLICATIONS 


1. The results obtained above can be used to compute with ease the expected 
frequencies when one is trying to fit the negative binomial distribution. After 
the parameters p and k are suitably estimated, all that one need do is to refer 
to the Incomplete Beta Function tables (or to the Binomial Distribution tables) 
for evaluating the expected cumulative frequencies and hence the expected 
frequencies. 

2. The results can be utilized in solving an acceptance sampling problem of 
the following type. Suppose that an acceptance sampling plan calls for drawing 
(with replacement) units from a lot until one gets 5 defectives. If 50 drawings 
or fewer are required, one rejects the lot. If more than 50 drawings are required, 
one accepts the lot. What is the probability of accepting a lot that is 10 per 
cent defective? 

Writing k = 5, p = .10 and N = 50, we have the probability of acceptance 
given by 

$$P_a = \sum_{x=N+1}^{\infty} \binom{x - 1}{k - 1} p^k (1 - p)^{x - k}$$

$$= 1 - \sum_{x=0}^{N-k} \binom{k + x - 1}{x} p^k (1 - p)^x$$

$$= 1 - Y(N - k, p, k).$$

By (3), $P_a$ = B(k − 1, p, N) = B(4, .10, 50). From Binomial tables, one has 
$P_a$ = 0.431199. More generally, we have the following. 

3. A single-sample binomial sampling plan is characterized by the sample 
size n that should be taken and the acceptance number c of defective units 
that cannot be exceeded without the lot’s being rejected. The specifications of 
the plan are taken to be the producer’s risk α at the acceptable quality level 
(AQL) and the consumer’s risk β at the lot-tolerance-fraction-defective (LTPD). 
These specifications dictate the particular choice of the parameters n and c 
of this plan determined by the two equations: 

$$B(c, \mathrm{AQL}, n) = 1 - \alpha \tag{12}$$
$$B(c, \mathrm{LTPD}, n) = \beta \tag{13}$$

A single-sample inverse binomial sampling plan is characterized by N and 
k where N is the smallest number of drawings required for accepting the lot 
when one keeps drawing until one gets k defectives. The specifications of the plan 
are taken to be the same as in the binomial sampling plan. These specifications 
dictate the particular choice of the parameters N and k of the inverse binomial 
sampling plan determined by the two equations: 

$$Y(N - k, \mathrm{AQL}, k) = \alpha \tag{14}$$
$$Y(N - k, \mathrm{LTPD}, k) = 1 - \beta \tag{15}$$

which in turn reduce to 

$$B(k - 1, \mathrm{AQL}, N) = 1 - \alpha \tag{16}$$
$$B(k - 1, \mathrm{LTPD}, N) = \beta \tag{17}$$
It is interesting to see the two similar pairs of equations [(12), (13)] and 
[(16), (17)] and hence the connection between the two types of sampling plans 
as reflected in their parameters, related by n = N and c = k − 1, which seems 
to make sense even intuitively. This suggests that tables and charts 
available to obtain binomial single-sample plans can be used to obtain inverse 
binomial single-sample plans for given specifications. One has N = n and 
k = c + 1. 
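
Identity (3) and the acceptance-sampling example of Application 2 can be checked numerically without tables. The Python sketch below is only an illustration for integer k: the function names are hypothetical, and both distribution functions are evaluated by direct summation rather than by the tables the paper refers to.

```python
from math import comb


def neg_binomial_cdf(r: int, p: float, k: int) -> float:
    """Y(r, p, k): sum over x = 0..r of C(k+x-1, x) p^k (1-p)^x, integer k."""
    return sum(comb(k + x - 1, x) * p**k * (1 - p) ** x for x in range(r + 1))


def binomial_cdf(c: int, p: float, n: int) -> float:
    """B(c, p, n): probability of at most c successes in n independent trials."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(c + 1))


# Application 2: draw until k = 5 defectives; accept if more than N = 50 draws.
k, p, n_draws = 5, 0.10, 50
p_accept_direct = 1.0 - neg_binomial_cdf(n_draws - k, p, k)
p_accept_binomial = binomial_cdf(k - 1, p, n_draws)  # B(4, .10, 50) by identity (3)

if __name__ == "__main__":
    print(round(p_accept_direct, 6), round(p_accept_binomial, 6))
```

The two routes should agree to rounding error, which is the content of (3); the same `binomial_cdf` call with c = k − 1 and n = N is what relates the inverse binomial plan to the ordinary single-sample plan.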
On Methods of Constructing Sets of Mutually 
Orthogonal Latin Squares Using a 
Computer. I* 


R. C. Bose, I. M. Chakravarti and D. E. Knuth 


University of North Carolina and Case Institute of Technology 


This paper deals with the problem of finding sets of mutually orthogonal Latin 
squares of order 4t (where 4t − 1 is a prime power) based on orthogonal mappings 
of a group. For the group G we take the module G(2, 2t) whose elements are vectors 
$(a_1, a_2)$ where $a_1$ is a residue class (mod 2) and $a_2$ is a residue class (mod 2t), the 
addition being defined by $(a_1, a_2) + (b_1, b_2) = (c_1, c_2)$ where $a_1 + b_1 \equiv c_1$ (mod 2) 
and $a_2 + b_2 \equiv c_2$ (mod 2t). Then the search for orthogonal mappings is materially 
simplified by using a configuration based on the balanced incomplete block design 
with parameters v = b = 4t − 1, r = k = 2t − 1, λ = t − 1. Using this method, 
two sets of five mutually orthogonal Latin squares of order 12 were obtained. 


1. INTRODUCTION 


A Latin square of order n is defined as an n × n square, the n² cells of which 
are occupied by n distinct symbols (which in particular may be Latin or Greek 
letters or Arabic numerals) such that each symbol occurs once in each row and 
once in each column. The cell in the i-th row and j-th column is called the cell 
(i, j). Two Latin squares are said to be orthogonal if on superposition each 
symbol of the first square occurs exactly once with each symbol of the second 
square. There may exist more than two Latin squares such that any pair is 
orthogonal. Thus Figure 1 exhibits five mutually orthogonal Latin squares of 
order 12, where the symbols are the numerals 1, 2, ..., 12. 

Latin squares were first studied by Euler [6] towards the end of the 18th 
century. He showed that if n is odd or n ≡ 0 (mod 4), then there exists at least 
one pair of orthogonal Latin squares of order n, and conjectured that if n ≡ 2 
(mod 4) then a pair of orthogonal Latin squares of order n cannot exist. This is 
trivially true for n = 2, and was proved by exhaustive enumeration for the case 
n = 6 by Tarry [13] in 1900. An excellent account of early researches on Latin 
squares is found in a paper by Norton [10] on the 7 × 7 squares. The first general 
results on the construction of mutually orthogonal Latin squares of a given 
order are due to MacNeish [8]. If 

$$v = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$$


* This research was supported in part by the United States Air Force, through the Air 
Force Office of Scientific Research of the Air Research and Development Command, under 
contract No. AF 49(638)-213. Reproduction in whole or in part is permitted for any purpose 
of the United States Government. 
where $p_1, p_2, \ldots, p_k$ are distinct primes, we define the arithmetic function 
n(v) by 

$$n(v) = \min\left(p_1^{n_1}, p_2^{n_2}, \ldots, p_k^{n_k}\right) - 1.$$

MacNeish showed that we can always construct at least n(v) mutually orthogonal 
Latin squares of order v, and conjectured that n(v) is the maximum possible 
number. If we denote by N(v) the maximum number of mutually orthogonal 
Latin squares of order v, then MacNeish’s theorem may be stated as 

$$N(v) \ge n(v)$$

and MacNeish’s conjecture as N(v) = n(v). This conjecture was shown to be 
wrong by Parker [11, 12], and following this Bose and Shrikhande [2] gave a 
counterexample to Euler’s conjecture. It was later shown by the three authors 
[5] that Euler’s conjecture is wrong for all n > 6. 
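
MacNeish's function n(v) is easy to evaluate by trial division, which makes the gap between the bound and what is actually attainable concrete. The sketch below is illustrative only; the function names are hypothetical and v is assumed to be an integer of at least 2.

```python
from typing import List


def prime_power_factors(v: int) -> List[int]:
    """Return the prime-power factors of v (each prime raised to its exponent)."""
    factors = []
    p = 2
    while p * p <= v:
        if v % p == 0:
            q = 1
            while v % p == 0:  # collect the full power of p dividing v
                v //= p
                q *= p
            factors.append(q)
        p += 1
    if v > 1:  # whatever remains is a single prime factor
        factors.append(v)
    return factors


def macneish_bound(v: int) -> int:
    """n(v): the smallest prime-power factor of v, less one."""
    return min(prime_power_factors(v)) - 1


if __name__ == "__main__":
    for order in (6, 9, 10, 12):
        print(order, macneish_bound(order))
```

For v = 12 the factors are 4 and 3, so n(12) = 2, whereas the sets exhibited in this paper contain five mutually orthogonal squares of order 12.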

These results make the present situation very interesting. There is very 
little knowledge about the exact nature of the function N(v) apart from the 
fact that N(v) ≤ v − 1, and the equality is attained when v is a prime power. 

We have given in this paper a method of searching for sets of mutually 
orthogonal Latin squares of order v = 4t where 4t − 1 is a prime power, and 
illustrated our method by considering the special case v = 12. The method is 
based on the concept of orthogonal mappings due to Mann [9]. He proved that 
a set of r orthogonal Latin squares based on a group G exists if G admits an 
r-fold complete set of mappings. To obtain such mappings he gave the method 
of automorphism which enables one to obtain a set of n(v) mutually orthogonal 
Latin squares of order v. He also gives some examples which show that there 
exist sets of orthogonal squares based on a group G which are not derivable by 
using the automorphism method. However, the number of squares in the sets 
given by him does not exceed n(v) in any case. 

For the group G we take the module G(2, 2t) whose elements are vectors 
$(a_1, a_2)$ where $a_1$ is a residue class (mod 2), and $a_2$ is a residue class (mod 2t), 
the addition being defined by 

$$(a_1, a_2) + (b_1, b_2) = (c_1, c_2)$$

where $a_1 + b_1 \equiv c_1$ (mod 2) and $a_2 + b_2 \equiv c_2$ (mod 2t). We then show that the 
search for orthogonal mappings is materially simplified by using a configuration 
based on the balanced incomplete block design with parameters 

$$v = b = 4t - 1, \quad r = k = 2t - 1, \quad \lambda = t - 1.$$


Using this method we were able to obtain two sets of five mutually orthogonal 
mappings of order 12 given in Table 2. The squares corresponding to the first 
of these sets are shown in Figure 1. 

Mendelsohn, Dulmage and Johnson [7] have also used the idea of orthogonal 
mappings, called orthomorphisms by them, to produce sets of five mutually 
orthogonal Latin squares of order 12, and to study the properties of projective 
planes of different orders. Their operational method of finding orthogonal 
mappings is completely different from that used in this paper. 

Work extending our methods in various directions is continuing and further 
results will be given in a subsequent paper. 


2. MUTUALLY ORTHOGONAL MAPPINGS OF A GROUP INTO ITSELF AND THE 
CONSTRUCTION OF MUTUALLY ORTHOGONAL LATIN SQUARES* 


Consider a finite group G of order n. Let α be a (1, 1) mapping of G into itself, 
the image of the element x ∈ G being denoted by αx. Thus, the equation 


(2.1) αx = c 


determines a unique x ∈ G given c ∈ G and conversely. 

Let 𝔐 be the set of all (1, 1) mappings of G into itself. Since each mapping 
is a permutation of the elements of G, the number of mappings in 𝔐 is n!. We 
shall denote these by Greek letters α, β, γ, etc., with or without subscripts, 
except that the identity mapping for which the image of every x ∈ G is x will 
be denoted by I. To each mapping α there corresponds a unique inverse mapping 
α⁻¹ such that if αx = a, then x = α⁻¹a. Also we may denote by βα the mapping 
for which the image of x is β[α(x)]. It is clear that βα ∈ 𝔐 and that the associative 
law γ(βα) = (γβ)α holds. 

Consider an n × n square. We can make a (1, 1) correspondence between 
the rows of the square and the elements of G. Thus, by the row x we shall mean 
the row corresponding to the element x of G. Similarly we can make a (1, 1) 
correspondence between the columns of the square and the elements of G, so 
that the column y is the column corresponding to the element y of G. The cell 
of the square belonging to the row x and the column y is said to be the cell (x, y). 

Theorem 1. If in each cell (x, y) of an n × n square we put the element 
(αx)y of G (where α is a mapping belonging to 𝔐) we get a Latin square L(α). 

To prove this we have to show that any element u of G occurs exactly once 
in a given row of L(α), and exactly once in a given column of L(α). This is 
equivalent to the statement that equation 


(2.2) (αx)y = u 

has a unique solution for a given x or a given y. This is obviously true since 

(2.3) x = α⁻¹(uy⁻¹), y = (αx)⁻¹u. 

Now consider two mappings α and β belonging to 𝔐. Consider a mapping 
for which the image of any element x of G is (αx)(βx)⁻¹. If this mapping belongs 
to 𝔐, i.e., is a (1, 1) mapping of G into itself, then the mappings α and β will 
be defined to be orthogonal. Hence, the necessary and sufficient condition for 
the mappings α and β to be orthogonal is that the equation 


(2.4) (αx)(βx)⁻¹ = c 


has a unique solution x ∈ G for any c ∈ G. 

Unless explicitly stated otherwise, all mappings referred to in the remaining 
part of the paper will be considered to belong to the set 𝔐 of (1, 1) mappings. 
The following Lemma is easily proved: 

Lemma 1. If the mappings α and β are orthogonal, then the mappings αγ 
and βγ are also orthogonal. 


* The results of this section are essentially given in Mann’s paper [9] referred to in the 
introduction. They are, however, given here for the sake of completeness and in the form in 
which we shall need them for subsequent developments. 

Theorem 2. The necessary and sufficient condition for the Latin squares
$L(\alpha)$ and $L(\beta)$ to be orthogonal is that the mappings $\alpha$ and $\beta$ are orthogonal.

Let $\alpha$ and $\beta$ be two mappings belonging to $\mathfrak{M}$, and orthogonal in the sense
described above. To prove that $L(\alpha)$ and $L(\beta)$ are orthogonal, we have to show
that when superposed any pair of elements $u, v \in G$ occur together in exactly
one cell. In other words, we have to show that the simultaneous equations

(2.5)    $(\alpha x)y = u, \quad (\beta x)y = v$

have a unique solution $(x, y)$ for any pair of elements $u, v$ belonging to $G$. It
follows from (2.5) that

(2.6)    $(\alpha x)(\beta x)^{-1} = (\alpha x)\, y y^{-1} (\beta x)^{-1} = u v^{-1}.$

Since $\alpha$ and $\beta$ are orthogonal, (2.6) uniquely determines $x$. Then $y$ is uniquely
determined by (2.5). In fact

$y = (\alpha x)^{-1} u = (\beta x)^{-1} v.$

Hence, the condition is sufficient.

Conversely, if $L(\alpha)$ and $L(\beta)$ are orthogonal, then the equations (2.5) have a
unique solution. Hence (2.6) determines $x$ uniquely for any given pair of elements
$u, v$ from $G$. If we choose $v$ to be the unit element, then $(\alpha x)(\beta x)^{-1} = u$ determines
$x$ uniquely for any given $u$, and shows that the mappings $\alpha$ and $\beta$ are orthogonal.
This shows that the condition is necessary and completes the proof of the theorem.

In the examples considered we have restricted ourselves to Abelian groups
only. The group operation can now be regarded as addition so that $G$ is a module.
The necessary and sufficient condition for the mappings $\alpha$ and $\beta$ to be orthogonal
is that the equation

$\alpha x - \beta x = c$

has a unique solution $x \in G$ for any $c \in G$. Theorem 1 now reads as follows:

Theorem 1A. If in each cell $(x, y)$ of an $n \times n$ square we put the element
$\alpha x + y$ of $G$ (where $\alpha$ is a mapping belonging to $\mathfrak{M}$), then we get a Latin square
$L(\alpha)$.

Theorem 2 remains unchanged.
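
As an illustration of Theorems 1A and 2, the following minimal sketch takes the simplest module, the integers mod 5, and the two mappings x -> 2x and x -> 3x. The choice of group, the mappings, and all function names are assumptions made for illustration; they do not appear in the paper.

```python
from itertools import product

def latin_square(alpha, n):
    """Cell (x, y) holds alpha(x) + y (mod n), as in Theorem 1A with G = Z_n."""
    return [[(alpha(x) + y) % n for y in range(n)] for x in range(n)]

def is_latin(square):
    """Every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[x][y] for x in range(n)} == symbols for y in range(n))
    return rows_ok and cols_ok

def mappings_orthogonal(alpha, beta, n):
    """alpha x - beta x = c must have exactly one solution x for every c."""
    return {(alpha(x) - beta(x)) % n for x in range(n)} == set(range(n))

def squares_orthogonal(a, b):
    """When superposed, every ordered pair of symbols occurs in exactly one cell."""
    n = len(a)
    pairs = {(a[x][y], b[x][y]) for x, y in product(range(n), repeat=2)}
    return len(pairs) == n * n

n = 5
alpha = lambda x: (2 * x) % n   # illustrative (1, 1) mapping of Z_5
beta = lambda x: (3 * x) % n    # illustrative (1, 1) mapping of Z_5
L_alpha = latin_square(alpha, n)
L_beta = latin_square(beta, n)
```

Here `mappings_orthogonal(alpha, beta, n)` and `squares_orthogonal(L_alpha, L_beta)` agree, as Theorem 2 requires; comparing a mapping with itself makes both checks fail.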


3. ON A METHOD OF SEARCHING FOR MUTUALLY ORTHOGONAL
LATIN SQUARES OF ORDER $4t$.


Consider the module

$G = G(2, 2t).$

We shall suppose the standard order of taking the elements of $G$ to be

(3.1)    $(0, 0), (0, 1), \ldots, (0, 2t - 1), (1, 0), (1, 1), \ldots, (1, 2t - 1).$


Let $\alpha_1 = I, \alpha_2, \ldots, \alpha_m$ be $m$ mutually orthogonal mappings of $G$ into itself.
Then $m \le 4t - 1$. Let $x_j = (a_j, b_j)$ be the $j$-th element of $G$ when written in
the standard order. We shall use the notation

$P_1(x_j) = a_j, \quad P_2(x_j) = b_j.$

Thus $P_1(x_j)$ stands for the first coordinate of the vector $x_j$ and $P_2(x_j)$ stands
for the second coordinate. We have

$P_1(x_j) = a_j \equiv 0 \pmod{2}$ if $1 \le j \le 2t$,
$P_1(x_j) = a_j \equiv 1 \pmod{2}$ if $2t + 1 \le j \le 4t$,
$P_2(x_j) = b_j \equiv j - 1 \pmod{2t}$ if $1 \le j \le 2t$,
$P_2(x_j) = b_j \equiv j - 1 - 2t \pmod{2t}$ if $2t + 1 \le j \le 4t$.

Let us consider the properties of the $m \times 4t$ matrix

(3.2)    $C = [c_{ij}] = [P_1(\alpha_i x_j)].$


Each element of $C$ is either 0 or 1 (mod 2). Consider any two rows of $C$, say the
$i$-th and $k$-th ($0 < i \le m$, $0 < k \le m$, $i \ne k$). Let $n_{00}$ be the number of columns
for which both the $i$-th and $k$-th rows of $C$ are occupied by 0. Similar definitions
can be given for $n_{01}$, $n_{10}$, $n_{11}$. Since $\alpha_i x_j$, $j = 1, 2, \ldots, 4t$, is some permutation
of the elements (3.1) we have

$n_{00} + n_{01} = 2t, \quad n_{10} + n_{11} = 2t.$

Similarly

$n_{00} + n_{10} = 2t, \quad n_{01} + n_{11} = 2t.$

Again, since $\alpha_i$ and $\alpha_k$ are orthogonal, $\alpha_i x_j - \alpha_k x_j$, $j = 1, 2, \ldots, 4t$, is some
permutation of the elements (3.1). Hence

$n_{00} + n_{11} = 2t, \quad n_{01} + n_{10} = 2t.$

It follows that

(3.3)    $n_{00} = n_{01} = n_{10} = n_{11} = t.$

If we take the submatrix formed by any two rows of $C$, then the four possible
pairs $\binom{0}{0}, \binom{0}{1}, \binom{1}{0}, \binom{1}{1}$
occur as the columns of this submatrix with equal frequency $t$. Thus the pattern
of the first coordinates of $\alpha_i x_j$ is known, and may be utilized for discovering a
set of orthogonal mappings $\alpha_1, \alpha_2, \ldots, \alpha_m$.

We have seen that $m \le 4t - 1$. A matrix $C$ of order $(4t - 1) \times 4t$ with the
required properties can be easily written down, from a solution of the balanced
incomplete block (BIB) design

(3.4)    $v = b = 4t - 1, \quad r = k = 2t - 1, \quad \lambda = t - 1.$

Let $N = [n_{ij}]$ be the incidence matrix of the design, i.e., $n_{ij} = 1$ or 0 according
as the $i$-th treatment does or does not occur in the $j$-th block. If we take the
submatrix formed by any two rows of $N$ the pair $\binom{1}{1}$ occurs as a column exactly
$t - 1$ times, since any two blocks of a symmetrical BIB design have exactly
$\lambda$ treatments in common. Since each row of $N$ has $k = 2t - 1$ unities and
$b - k = 2t$ zeros it follows that the pairs $\binom{1}{0}$, $\binom{0}{1}$ and $\binom{0}{0}$ each occur $t$ times.
Let $j$ denote a $(4t - 1) \times 1$ column vector, all of whose elements are unity; then
the matrix $[j, N]$ has the required properties. We shall, however, take for $C$
the matrix obtained from $[j, N]$ by interchanging the elements 0 and 1. Thus
the initial column of $C$ will consist of zeros.

The BIB design with parameters (3.4) can be obtained by using the method
given by Bose in [1] in the particular case when $4t - 1$ is a prime power. In
general, a solution for (3.4) exists whenever a Hadamard matrix of order $4t$
exists (see Bose and Shrikhande [3]).
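
The construction of $C$ from a symmetric BIB design can be sketched for $t = 3$ as follows. The design used here is an assumption: it is developed from the quadratic residues mod 11, a standard difference set with the parameters (3.4), and need not coincide with the particular solution or column ordering the authors used. The sketch then checks property (3.3) for every pair of rows.

```python
from itertools import combinations

t = 3
v = 4 * t - 1   # 11 treatments and 11 blocks

# Assumed design: develop the quadratic residues mod 11, {1, 3, 4, 5, 9}, cyclically.
base_block = sorted({(x * x) % v for x in range(1, v)})
blocks = [{(b + s) % v for b in base_block} for s in range(v)]

# Incidence matrix N: n_ij = 1 if treatment i occurs in block j, else 0.
N = [[1 if i in blocks[j] else 0 for j in range(v)] for i in range(v)]

# C is [j, N] with 0 and 1 interchanged, so its initial column is all zeros.
C = [[0] + [1 - entry for entry in row] for row in N]

def pair_counts(row_i, row_k):
    """Return the counts n_00, n_01, n_10, n_11 for two rows of C."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(row_i, row_k):
        counts[(a, b)] += 1
    return counts

# Property (3.3): each of the four column pairs occurs exactly t times.
balanced = all(
    set(pair_counts(C[i], C[k]).values()) == {t}
    for i, k in combinations(range(v), 2)
)
```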

In the special case $t = 3$, the matrix $C$ can be obtained from a solution of
the BIB design

$v = b = 11, \quad r = k = 5, \quad \lambda = 2.$

Suppose in the first row of $C$, we adjoin the elements 0, 1, 2, 3, 4, 5 (mod 6) as
second coordinates, obtaining

(3.6)    (0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5),
         (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5)

viz., the elements of $G(2, 6)$ in the standard order. According to the notation
already introduced these are the elements $x_1, x_2, \ldots, x_{12}$. The identity mapping
$\alpha_1 = I$ is then the mapping which transforms $x_j$ to $x_j$.
The next problem is to adjoin to each element $c_{2j}$ of the second row of $C$ an
element $d_{2j}$ belonging to the ring of residue classes (mod 6) in such a way that
there exists a mapping $\alpha_2 \in \mathfrak{M}$, orthogonal to $I$, such that

$\alpha_2 x_j = (c_{2j}, d_{2j}).$

For this it is necessary and sufficient that the following conditions be satisfied:


(i) $d_{2j}$ takes the six values 0, 1, 2, 3, 4, 5 once for those $j$ for which $c_{2j} = 0$
(i.e., for $j$ = 1, 2, 3, 10, 11, 12), and the same holds for the six values of
$j$ for which $c_{2j} = 1$ (i.e., for $j$ = 4, 5, 6, 7, 8, 9).

(ii) $d_{2j} - P_2(x_j)$ takes the six values 0, 1, 2, 3, 4, 5 once for those values of
$j$ for which $c_{2j} - c_{1j} \equiv 0$ (mod 2) (i.e., $j$ = 1, 2, 3, 7, 8, 9) and the same
holds for the six values of $j$ for which $c_{2j} - c_{1j} \equiv 1$ (mod 2) (i.e., $j$ = 4, 5,
6, 10, 11, 12).


Let $S_2$ denote the set of all mappings satisfying these conditions.

The total number of ways in which condition (i) can be satisfied by the adjoined
elements $d_{2j}$ is $6! \times 6!$. A program was devised with the help of a computer
to select those ways for which the condition (ii) is also satisfied. In this way
we obtained a set $S_2$ of 92 mappings belonging to $\mathfrak{M}$ and orthogonal to $I$, such
that for any $\alpha_2 \in S_2$

$P_1(\alpha_2 x_j) = c_{2j}.$
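
The selection step can be sketched as a brute-force filter over the $6! \times 6!$ assignments allowed by condition (i). This is only an illustration under stated assumptions: the rows `c1` and `c2` are read off the index sets quoted in conditions (i) and (ii), the function names are hypothetical, and `d_example` is an assignment worked out by hand for this sketch, not one taken from the authors' output. The sketch makes no claim to reproduce the count of 92.

```python
from itertools import permutations

c1 = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]   # first row of C (identity mapping)
c2 = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]   # second row of C, from condition (i)
b = [j % 6 for j in range(12)]              # P_2(x_j) for j = 1, ..., 12
ALL = set(range(6))

def satisfies_i(d):
    """d takes each residue once where c2 = 0 and once where c2 = 1."""
    zeros = {d[j] for j in range(12) if c2[j] == 0}
    ones = {d[j] for j in range(12) if c2[j] == 1}
    return zeros == ALL and ones == ALL

def satisfies_ii(d):
    """d_j - P_2(x_j) takes each residue once on each class of c2 - c1 (mod 2)."""
    same = {(d[j] - b[j]) % 6 for j in range(12) if (c2[j] - c1[j]) % 2 == 0}
    diff = {(d[j] - b[j]) % 6 for j in range(12) if (c2[j] - c1[j]) % 2 == 1}
    return same == ALL and diff == ALL

def search():
    """Yield every assignment d that satisfies (i) by construction and also (ii)."""
    zero_pos = [j for j in range(12) if c2[j] == 0]
    one_pos = [j for j in range(12) if c2[j] == 1]
    for p in permutations(range(6)):
        for q in permutations(range(6)):
            d = [0] * 12
            for pos, val in zip(zero_pos, p):
                d[pos] = val
            for pos, val in zip(one_pos, q):
                d[pos] = val
            if satisfies_ii(d):
                yield d

# Hand-constructed illustrative assignment (an assumption, not the authors' data).
d_example = [0, 2, 4, 0, 2, 4, 3, 5, 1, 3, 5, 1]
```

Calling `sum(1 for _ in search())` enumerates all 518,400 candidates and counts those that pass.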


Applying the same procedure to the $i$-th row of $C$ we can get a set $S_i$ of mappings
belonging to $\mathfrak{M}$ and orthogonal to $I$ such that if $\alpha_i \in S_i$, $1 < i \le 4t - 1$,

$P_1(\alpha_i x_j) = c_{ij}.$

If we denote by $n(S_i)$ the number of mappings belonging to $S_i$, then the values
of $n(S_i)$ are shown in Table 1. These sets were actually obtained with the help
of the computer.


TABLE 1


i          2     3     4     5     6     7     8     9    10    11
n(S_i)    92   108    80    80  8684    84    80    84    80    84


We next determined all possible pairs $\alpha_2, \alpha_3$ of orthogonal mappings, the
first belonging to $S_2$ and the second to $S_3$. The further condition to be satisfied
for this can be written as:


$P_2(\alpha_2 x_j) - P_2(\alpha_3 x_j)$ takes the six values 0, 1, 2, 3, 4, 5 once for those $j$ for
which $c_{2j} - c_{3j} \equiv 0$ (mod 2), and the same holds for the six values of $j$ for
which $c_{2j} - c_{3j} \equiv 1$ (mod 2).


In this way 28 pairs $\alpha_2, \alpha_3$ satisfying these conditions were found. We thus
have 28 sets of three mutually orthogonal mappings $\alpha_1 = I, \alpha_2, \alpha_3$ for which

$P_1(\alpha_i x_j) = c_{ij}, \quad i = 1, 2, 3.$
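
The pairwise test used at each stage amounts to checking that the difference of two mappings is again a (1, 1) mapping of $G(2, 6)$. A minimal sketch follows; the representation of a mapping as a list of images in the standard order, the function names, and the sample mapping `alpha2` are all assumptions made for illustration (`alpha2` is hand-built and is not claimed to belong to the authors' set $S_2$).

```python
T = 3
# Elements of G(2, 2t) in the standard order (3.1).
ELEMENTS = [(a, b) for a in range(2) for b in range(2 * T)]

def subtract(u, v):
    """Componentwise difference in G(2, 2t)."""
    return ((u[0] - v[0]) % 2, (u[1] - v[1]) % (2 * T))

def is_bijection(mapping):
    """A mapping is a list of 4t images; it is (1, 1) if every element occurs once."""
    return sorted(mapping) == ELEMENTS

def orthogonal(alpha, beta):
    """alpha and beta are orthogonal when x -> alpha x - beta x is (1, 1)."""
    return is_bijection([subtract(u, v) for u, v in zip(alpha, beta)])

def mutually_orthogonal(mappings):
    """Test every pair in a list of mappings."""
    return all(
        orthogonal(m1, m2)
        for idx, m1 in enumerate(mappings)
        for m2 in mappings[idx + 1:]
    )

identity = list(ELEMENTS)
# Illustrative mapping: first coordinates from the second row of C,
# second coordinates chosen by hand for this sketch.
c2 = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
d2 = [0, 2, 4, 0, 2, 4, 3, 5, 1, 3, 5, 1]
alpha2 = list(zip(c2, d2))
```

Extending a set of mutually orthogonal mappings by one more candidate then reduces to calling `orthogonal` against each mapping already in the set.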


Similar conditions can be written down for a mapping belonging to $S_4$ to be
orthogonal to each of the mappings $\alpha_2, \alpha_3$. Testing the 80 mappings belonging
to $S_4$ for orthogonality with each of the 28 pairs $\alpha_2, \alpha_3$, there were found just
two sets $\alpha_1 = I, \alpha_2, \alpha_3, \alpha_4$ of mutually orthogonal mappings satisfying

$P_1(\alpha_i x_j) = c_{ij}, \quad i = 1, 2, 3, 4.$

Finally, testing each mapping belonging to $S_5$ for orthogonality with the
four mappings of the two sets of four mutually orthogonal mappings, we obtained
one set of five mutually orthogonal mappings $\alpha_1 = I, \alpha_2, \alpha_3, \alpha_4, \alpha_5$ satisfying

$P_1(\alpha_i x_j) = c_{ij}, \quad i = 1, 2, 3, 4, 5.$


The mappings of the sets $S_6, S_7, \ldots, S_{11}$ were tested for orthogonality with
the five mutually orthogonal mappings $I = \alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$ but none was
found orthogonal to each of the five mappings.

As a result of further extensive, but not exhaustive, trials, another set of
five mutually orthogonal mappings $I = \alpha_1, \alpha_2^*, \alpha_3^*, \alpha_8^*$ and $\alpha_9^*$ was discovered,
where $\alpha_i^* \in S_i$ ($i$ = 2, 3, 8, 9).

The two sets of five mutually orthogonal mappings are displayed in Table 2
where, instead of $(a_j, b_j)$, we write $a_j b_j$ for short, e.g., 14 instead of (1, 4).


TABLE 2

[Set I and Set II of five mutually orthogonal mappings; the tabulated entries are not legible in this copy.]


Set I was used to construct a set of five mutually orthogonal Latin squares
of order 12 (applying Theorem 2 but replacing $x_j$ by $j$). These squares have
the property $D_1$ (see Bose and Nair [4]), that all the squares can be obtained
from the basic square by row permutations. Similarly, Set II can be used
to obtain another set of five mutually orthogonal squares of order 12 possessing
the property $D_1$.


4. CONCLUDING REMARKS


The application of the general method described in Section 2 can be extended
in various ways. For example, instead of groups $G(a, b)$ we may take the group
$G(a_1, a_2, \ldots, a_k)$ where $a_i$ is a residue class (mod $n_i$).

It is also possible to modify the method so as to obtain sets of mutually
orthogonal Latin squares not possessing the property ($D_1$). Actually a large
number of mutually orthogonal Latin squares of order 12 of this type have
been obtained. This work will be described in a subsequent report.

Work is now proceeding on the construction of mutually orthogonal Latin
squares of order 20.


Book Reviews


Henry Scheffé, The Analysis of Variance, (New York: John Wiley & Sons, 1959),
477 pp.


This book is an excellent, clearly written text on the theory of analysis of variance. The 
author fashions a compromise between the aim of reaching students whose mathematics 
barely includes some calculus and the aim of surveying modern theoretical research in the 
field. As a result he covers the middle ground best, but the student lacking mathematical 
depth will find the going hard, and it is doubtful that the book will give much guidance to 
future theoretical research. The student will find ample exercises, especially in Part I. There
is extensive footnoting directed at the reader who wants more depth than the main text affords. 

The author’s viewpoint is that of a mathematical statistician; that is, the first concerns 
are with the definition of mathematical models and with the application of the methods of 
mathematical statistics to these models to produce appropriate point estimates, confidence 
intervals and hypothesis tests. Still, a secondary aim of relating the theory to practice is not 
neglected and there is considerable discussion of where the models came from, how to decide 
among competing statistical methods, how to carry out computations, etc. 

A division is made into Parts I and II where Part I, comprising about sixty per cent of the 
text, deals with that part of the theory which the author considers to have “jelled”, namely
the theory related to the standard Model I with fixed effects and random (often normal) errors. 
The first two chapters of Part I are the most mathematically abstract and provide a general 
basis of least squares theory for the specific topics of the remaining four chapters. The less 
mathematical reader is encouraged to persevere and may be helped by the fresh geometrical 
treatment. There are several mathematical appendices explaining matrix notations and geo- 
metrical ideas. Chapter 3 is mostly devoted to multiple comparisons including the Scheffé 
method based on the F-statistic, the Tukey method based on the studentized range, a com- 
parison of these, and some discussion of more general methods of multiple comparisons. Chap- 
ter 3 also has a bonus section on how to compare variances robustly. The remaining chapters 
of Part I present material which is more standard, namely theory for arrays with two, three 
and more factors (Ch. 4), theory for various incomplete arrays and nested designs (Ch. 5), 
and the analysis of covariance (Ch. 6). In Part II Chapters 7 and 8 develop, by special cases 
and not in full generality, the models which arise from the assumption that the factor levels 
and hence the associated effects are randomly drawn from some population, i.e. the models 
sometimes called Model II, III, etc. Chapter 7 includes the interesting methods of Bulmer 
(Biometrika 1957) on confidence intervals for variance components. Chapter 9 treats the 
slightly different type of random effects models where the randomness arises from the random 
choice of a randomized design e.g. randomized blocks or some types of randomized latin 
squares. Chapter 9 includes discussion of the permutation tests applicable to randomized 
designs. 

Finally, Chapter 10 deserves special mention since in it the author presents an admirable 
summary and synthesis of the scattered results on the robustness of the standard normal 
theory techniques. Specifically, the effects on F tests of non-normality (of third and fourth 
moments) of errors, of heterogeneity of error variance and of non-independence of errors are 
considered. A brief section on transformations is included. 


A. P. Dempster


Inspection and Quality Control HANDBOOK (Interim) H 106, ‘Multi-Level Continuous
Sampling Procedures and Tables for Inspection by Attributes,’ 31 October
1958, Office of the Assistant Secretary of Defense (Supply and Logistics), Washington.


This handbook was developed by the Air Materiel Command, USAF, following useful 
applications of 3-level continuous sampling plans, especially to products comprising large 
complex units. It provides a limited set of multi-level continuous sampling plans, with initial 
sampling rates of 1/2 and 1/3 and with 1, 2, 3, 4 and 5 sampling levels, together with instruc- 
tions for selecting sampling plans and administering them. These are intended for application 
“to procurement, storage, and maintenance inspection and test operations where lot accumulation
prior to inspection is not feasible.” The plans are indexed in terms of AOQL (Average
Outgoing Quality Limit), with a range of AOQL values from 0.10 to 15 per cent defective, and 
as such are applicable to inspections and tests that are nondestructive. Tables and curves 
are given to aid in the selection of the number of sampling levels (k) to use in specific cases; 
these include curves of AOQ (Average Outgoing Quality) and of AFI (Average Fraction In- 
spected) for the several levels when the process average quality of submitted product is equal 
to the AOQL. Procedures for selecting sample units on a random basis are also given. 

Continuous sampling plans, whether single-level or multi-level are designed for appli- 
cation where, for any one of a number of reasons, it is advantageous to inspect and to dispose 
of product on a continuous basis rather than on a lot-by-lot basis. Advantages that normally 
accrue include: no need for forming lots, less storage space, quick release of finished units, 
quicker detection of faulty processing. Single level continuous sampling plans of the general 
type under consideration here were published by Dodge in 1943 (CSP-1) and by Dodge and 
Torrey in 1951 (CSP-2 and CSP-3). For these plans, inspection alternates between 100 per 
cent inspection and sampling, with the protection to the consumer expressed in terms of 
AOQL. Two parameters are involved: i, the number of successive items that must be free
of defects in order to qualify for a shift to sampling inspection, and f, the fraction of items
inspected during sampling, e.g., 1/2, 1/10, etc. For CSP-1, a single defective found during
sampling calls for a return to 100 per cent inspection; for CSP-2, one defective is allowed,
but a second defective close after the first calls for return to 100 per cent inspection; for CSP-3,
the procedure is the same as for CSP-2, except that a special feature provides extra protection
against spotty quality, by requiring the inspection of the next four items following an allowed
defective, and immediate return to 100% inspection if one of the four is found to be defective.

Multi-level continuous sampling plans, allowing reduced sampling on a geometric scale
from f to f^2, f^3, f^4, ..., f^k, were published by Lieberman and Solomon in 1954. Under these plans,
100 per cent of the items are inspected until a stated number, i, are found free of defects. Inspection
is then reduced to a fraction, f, of the items submitted. If, while sampling at the f
rate, i successive inspected items are found free of defects, the sampling rate is decreased to
a new fraction, f^2. This procedure continues through the several levels of the plan, but, at
any level, if a defective item is found before i successive items are found clear of defects, the
sampling rate is increased by one level. This constitutes a modification of CSP-1, by providing
several levels instead of a single level of inspection, which permits economy in inspection
effort if quality is consistently good. A common criticism of this type of plan, when used with
several sampling levels, is that if quality gradually or even suddenly deteriorates, the plan
is slow to respond, moving step by step from smaller levels to tighter levels of sampling, with
somewhat annoying deliberation. Modified rules for shifting between levels to give “tightened”
multi-level plans have been provided in a 1956 paper by Derman, Littauer, and Solomon.
However, Handbook H-106 makes use of what is called a “special procedure”, to differentiate
between “the random occurrence of a defective and inferior or spotty quality”. This is the
procedure of CSP-3, requiring the inspection of “the next four items” immediately following a
defective item. Thus the plans in H-106, developed from an earlier Air Force Manual, AMC
Manual 74-23, have been described by Ireson and Biedenbender as “an extension of Dodge-Torrey
CSP-3 plans to additional sampling levels”; the k = 1 plans are CSP-3 plans. In such
technical handbooks, the inclusion of references to source material would be helpful to the
serious student.
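
The level-shifting rule described above for the Lieberman and Solomon plans can be sketched as a small state machine. This is a simplified illustration under stated assumptions: level 0 stands for 100 per cent inspection, level j for sampling at rate f^j, and the “next four items” provision of H-106 is deliberately left out. The function names are hypothetical.

```python
from fractions import Fraction

def next_state(level, run, defective, i, k):
    """Advance the plan by one inspected item.

    level: 0 means 100 per cent inspection; level j >= 1 means sampling at rate f**j.
    run:   number of successive defect-free inspected items at the current level.
    i:     clearance number; k: number of sampling levels in the plan.
    """
    if defective:
        # A defective moves the plan back one level (never below 100% inspection).
        return max(level - 1, 0), 0
    run += 1
    if run >= i and level < k:
        # i successive clear items: move to the next, sparser sampling level.
        return level + 1, 0
    return level, min(run, i)

def sampling_rate(level, f):
    """Fraction of submitted items inspected at a given level."""
    return Fraction(1) if level == 0 else f ** level

# Example: i = 3, k = 2; six clear items, then two defectives.
state = (0, 0)
history = []
for defective in [False] * 6 + [True, True]:
    state = next_state(state[0], state[1], defective, i=3, k=2)
    history.append(state)
```

In this example the plan climbs to level 2 after six clear items and needs two separate defectives to return to 100 per cent inspection, which is the step-by-step behaviour criticized in the text.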

The Handbook has three parts: I: General Information on the Sampling Plans, which
explains terms, describes the sampling plans, and gives procedures for selecting and using
the plans; II: Techniques and Procedures, which gives procedures for random selection of
sample items, for truncating the number of uninspected items, and for the determination of
presented and outgoing quality; and III: Appendices, giving an extensive set of curves for
determining AOQ (Average Outgoing Quality) and AFI (Average Fraction Inspected). Included
are tables of sampling plans for f = 1/2 and f = 1/3 and 1, 2, 3, 4, and 5 sampling
levels, with AOQL values from 0.1% to 15% defective.

On the matter of selecting a sampling plan, two areas are covered: (a) how to choose between
f = 1/2 and f = 1/3, and (b) how to choose k, the number of sampling levels, whether
k = 1, 2, 3, 4, or 5, and hence whether to carry the sampling rates to f, f^2, f^3, f^4 or f^5. As for
(a), the suggestion is made that the AFI curves are of assistance in choosing between f = 1/2
and f = 1/3, a point not quite clear to the reviewer, but that, in general, f = 1/3 is best suited
for production line work where the process can be readily controlled and where the production
rate is quite high. What constitutes “quite high” is not clear; undoubtedly experience will
be a guide.

As for (b), once an AOQL value has been selected and a choice made between f = 1/2 
and f = 1/3, the choice of plan, whether a 1, 2, 3, 4, or 5-level plan, is determined by N, the 
“contract size or production rate”. A two-way table is provided to facilitate this choice— 
single level plans where both N and AOQL are small, and 5 level plans when both are large. 
This is one possible choice, the one that the handbook user would be inclined to use on reading 
the general instructions. But whether this is generally a good choice is open to certain questions. 

First of all, should a 5-level plan be the preferred choice just because the contract size is
large enough to permit economies in AFI? In the event of quality deterioration, 5-level plans,
say with f^5 = (1/2)^5 = 1/32 or especially with f^5 = (1/3)^5 = 1/243, can take a long time to
return to 100% inspection, even with the “inspect the next 4” provision, i.e., they have an
inherent lack of “local stability”. Two or perhaps three levels, say with f^3 = 1/8 or 1/27,
may normally be slow enough, but 5 levels, to some quality control engineers, may be judged
far too slow and hazardous when it comes to important quality characteristics. It is felt that
this feature should be discussed, leaving the door open for, perhaps even encouraging, the
more frequent choice of, say, 2 and 3 levels on judgment grounds. What seems especially
appropriate here is a “thinking clause”, something that is provided to a degree, perhaps, in
paragraph 3.6e which says: “The plans in Table II are recommended for normal use, but it
may be advisable to select other plans for particular uses.”
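
The sampling fractions at the deepest level follow directly from f and k. The short sketch below tabulates f^k exactly for the two initial rates in the Handbook; the variable names are illustrative only.

```python
from fractions import Fraction

# Exact sampling fraction f**k at level k, for f = 1/2 and f = 1/3.
rates = {
    f: [f ** k for k in range(1, 6)]
    for f in (Fraction(1, 2), Fraction(1, 3))
}

for f, levels in rates.items():
    print(f"f = {f}: " + ", ".join(str(r) for r in levels))
```

At five levels a plan starting from f = 1/3 inspects only one item in 243, against one in 27 at three levels.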

Second, why solve for a minimum AFI when the process average is equal to the AOQL? 
Solution for a process level of say 3/4 or 2/3 the AOQL would have seemed more useful. How- 
ever, this would presumably have called for wider use of k = 4 and 5 so the actual solution 
would seem preferable to those who, in the interests of protection against spotty quality, will 
prefer to avoid higher k values. 

Third, just what is meant by “production rate” in the phrase “contract size or production
rate” is not clear. If a contract is to run for years “with or without a fixed quantity being
specified”, the recommendation is to enter the table “with the approximate production rate”.
Is this the annual rate? Elsewhere the phrase “estimated quantity to be produced in a production
run” is suggested, and again N is the “production rate or lot size”. For intermittent
production, does one use the quantity in a “production run” or “the contract size”? If the
choice of plan is to depend primarily on N, then N should be much more clearly defined.

The Handbook refers to the AOQL as describing “the sampling risk involved in using plan.”
This seems most unfortunate. With the standard usage of the terms “producer’s risk” and
“consumer’s risk” in quality control literature, this use of the term “risk” can only confuse
and should be avoided in the next printing or revision.

No mention is made of the usual and highly desirable provision that the 100% inspection
or screening operations be performed by a “screening crew” of the producer rather than by
the sampling inspector, and that during such screening periods the sampling inspector carry
on a verification inspection. This provision is normally considered essential for single level
plans.

From the AOQ curves shown on page 23, it appears that the plans for f = 1/2 and k = 4
and 5 have AOQL values of about 16% instead of 15%; curves for k = 2 and k = 3 should
be added or their absence explained.

The tabulation in Table I of the Handbook calls for a few comments. This table provides
“a comparison between the AOQL values of this handbook and those associated with the
sampling plans of MIL-STD-105”, presumably to facilitate substitution of one type of plan
for the other. First of all, two columns of values are listed under MIL-STD-105A with a 
value of AOQL for each AQL. What are these AOQL values? Do they correspond to the 
sampling plans associated with one Sample Size Code Letter, say M, and if so, why? Do they 
assume an infinite lot size? Actually, the AOQL values vary quite widely among the several 
different Code Letter sampling plans indexed under a single AQL value in the MIL-STD. 
And again: Is the AOQL of an H-106 plan being compared with the AOQL of the sampling 
plan for “normal inspection” of MIL-STD-105? If so, this could be misleading because the 
protection offered by MIL-STD-105A is determined not alone by the sampling plan for 
“normal inspection”, but by the over-all system of procedures, involving both “normal in- 
spection” and “tightened inspection”, the latter being required when submitted quality 
runs significantly worse than the AQL. The protection offered by the over-all procedure is 
thus tighter than that offered by the “normal inspection” plan alone. Thus, without some 
explanation, it will not be clear to the reader just what relationship is implied by the tabulated 
comparison of Table I. If he should try to use this table as a basis for substituting one type 
of procedure for the other (H106 vs. MIL-STD-105A), then, lacking some information regard- 
ing statistical equivalence or the lack of it, he might well run into difficulties. 

In summary, the Handbook provides a full set of multi-level continuous sampling plans 
based on the Dodge-Torrey CSP-3 single level plan, and indexed by AOQL (0.1% to 15.0% 
defective). These plans have already been used extensively by the Air Force and should find 
wide application in situations that require initial sampling rates of f = 1/2 or f = 1/3. The 
choice of number of sampling levels (k = 1, 2, 3, 4, or 5) is based on minimizing the AFI 
(average fraction inspected) for the given “size of contract”, when presented quality equals
the AOQL. In practice, the choice of number of sampling levels will probably be determined 
largely by matters of judgment and experience, but even here the Handbook can serve a 
useful purpose by noting the optimum choice for a stated set of conditions. Notably missing 
is an explanation of the meaning of a tabulated comparison of H106 plans and MIL-STD-105A 
plans and how this can be used in practice. 

H. F. Dodge 





Vol. 2, No. 4 TECHNOMETRICS


NOTICES 


A NEW FEATURE


Technometrics will contain a new section devoted to the announcement and 
brief description of computer programs of interest to statisticians beginning 
with the February 1961 issue. The new section will be run on a trial basis for 
one year. Each computer program announcement is to contain: 


1. A brief problem description, 
2. The type of computer for which the program is applicable, 
3. The author’s name and address, 
4. A brief description of the program including: 
Limitations, 
Auxiliary equipment requirements, 
Statements on accuracy, 
Availability of sample problems, 
Running time estimates, 


Program storage requirements. 


The readers of Technometrics are invited to send the above information on 
computer programs for consideration for publication to: 


Dr. Fred C. Leone, Director

Statistical Laboratory, Case Institute of Technology 
10900 Euclid Ave. 

Cleveland 6, Ohio. 


All programs submitted for publication must be available for distribution 
subject only to nominal costs for the reproduction and mailing of program tapes, 
cards, etc. Inquiries about programs are to be addressed to the authors. Comments 
and critiques of the announced programs may be sent to Dr. Leone and will
be published when considered appropriate.


LETTERS TO THE EDITOR


To The Editor: 


In reference to programming Fisher’s exact method for 2 × 2 contingency
tables [Robertson, W. H., Technometrics, 2, 103 (1960)], I would like to report
that we have found the use of the first four terms of Stirling’s formula very
satisfactory as an approximation to the required factorials. This expression is

$n! = \sqrt{2\pi n}\; n^n e^{-n} \left(1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51840n^3}\right)$

Our program uses exact values for factorials zero and one and the value
given by the above approximation for $n \ge 2$. This approximation gives 1.999986
for factorial two and improves (percentagewise) as the number increases. Thus
a saving of time can be effected for large numbers. The final probability is
accurate to more than three decimal places.
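
A minimal sketch of this approach is given below. It assumes the four-term Stirling series $1 + 1/(12n) + 1/(288n^2) - 139/(51840n^3)$ and uses it to evaluate the probability of a single 2 × 2 table with fixed margins; the function names are hypothetical and this is not the writer’s program.

```python
import math

def stirling_factorial(n):
    """Approximate n! with four terms of Stirling's series; exact for n = 0, 1."""
    if n < 2:
        return 1.0
    series = 1 + 1 / (12 * n) + 1 / (288 * n**2) - 139 / (51840 * n**3)
    return math.sqrt(2 * math.pi * n) * n**n * math.exp(-n) * series

def table_probability(a, b, c, d, fact=stirling_factorial):
    """Hypergeometric probability of the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = fact(a + b) * fact(c + d) * fact(a + c) * fact(b + d)
    denominator = fact(n) * fact(a) * fact(b) * fact(c) * fact(d)
    return numerator / denominator

approx_two = stirling_factorial(2)              # close to the 1.999986 quoted above
p_approx = table_probability(3, 1, 1, 3)
p_exact = table_probability(3, 1, 1, 3, fact=math.factorial)
```

Passing `math.factorial` as `fact` gives the exact probability for comparison; an exact test would then sum such probabilities over all tables at least as extreme as the one observed.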


Lloyd S. Nelson
General Electric Lamp Division
Cleveland 12, Ohio


Errata


“The Percentile Points of Distributions Having Known Cumulants” 
R. A. Fisher and E. A. Cornish, TECHNOMETRICS, Vol. 2, No. 2, May 1960


Page 210, the expression 


should read


Page 222 the expression 


1 (2) —z/a dX 
(n — 1)! \a . a 


should read 

1 (2) = dX , 
(n — 1)! \a a 
Page 225 


N. L. Johnson and B. L. Welch (1399) should read N. L. Johnson and B. L.
Welch (1939).





“Order Statistics from the Gamma Distribution” 
S. S. Gupta, TECHNOMETRICS, Vol. 2, No. 2, May 1960 


The entries on the last six rows of Tables III C, D, and E, pages 256, 259 and
260 respectively, should be permuted in the following way: 


Table III C to III E 
Table III E to III D 
Table III D to III C 


Copies of the last six lines of Tables III C, D and E are given below for those who
may wish to paste these corrected entries over those appearing in the original
tables.


Paste over the bottom of Table III C 


4.54813 5.14540 5.50628 6.19170 7.10651 8.23505 9.48767 10.36591 
5.50628 6.08801 6.43922 7.10651 7.99876 9.10309 10.33359 11.19903 
6.43922 7.00552 7.34770 7.99876 - 57152 9.95530 11.16718 12.02157 
7.34770 7.90014 8.23442 8.87152 72774 =10.79414 12.83523 
8.51577 9.05283 9.37843 10.00031 12. BE 23 11.88592 13.06424 13.89855 
9.37843 9.90592 10.22614 10.83852 11.66545 12.70105 13.86856 14.69547 


For given $N$, $k$ ($1 \le k \le N$) and $\alpha$ the entries in the above table are the values
of $y$ for which

$\frac{N!}{(k-1)!\,(N-k)!} \int_0^y [G_r(z)]^{k-1} [1 - G_r(z)]^{N-k} g_r(z)\,dz = \alpha$,

where $g_r(z)$ and $G_r(z)$ refer
to the p.d.f. and c.d.f. of the standardized gamma chance variable, respectively.
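
Entries of this kind are percentage points of the $k$-th order statistic in a sample of $N$ from a standardized gamma distribution, and can be sketched with the standard library alone. The sketch assumes an integer gamma parameter, called `r` here, with p.d.f. $z^{r-1}e^{-z}/(r-1)!$, and uses the identity that the integral above equals a binomial tail sum in the c.d.f. The parameter name `r` and the function names are assumptions; the values of $N$, $k$ and the gamma parameter for the rows printed above are not given in this erratum, so no table entry is reproduced.

```python
import math

def gamma_cdf(y, r):
    """C.d.f. of the standardized gamma variable with integer parameter r."""
    if y <= 0:
        return 0.0
    return 1.0 - math.exp(-y) * sum(y**m / math.factorial(m) for m in range(r))

def order_stat_cdf(y, N, k, r):
    """P(k-th smallest of N <= y) = sum over j >= k of C(N, j) G^j (1 - G)^(N - j)."""
    G = gamma_cdf(y, r)
    return sum(math.comb(N, j) * G**j * (1 - G) ** (N - j) for j in range(k, N + 1))

def percentage_point(alpha, N, k, r, tol=1e-10):
    """Solve order_stat_cdf(y) = alpha for y by bisection (0 < alpha < 1)."""
    lo, hi = 0.0, 1.0
    while order_stat_cdf(hi, N, k, r) < alpha:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if order_stat_cdf(mid, N, k, r) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For a single exponential observation (N = k = r = 1) and alpha = 0.5 this returns the median, ln 2.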


Paste over the bottom of Table III D 


-8176 6.484 +o a 8.63879 9.86691 11.20653 12.14442 14.20609 

“Bo 5 7. “s25ee 3 y 9.60588 10.79446 12.10998 2 15. 06188 
8.52889 088 8 10.54591 11.70644 928 15. 90776 
9.49931 9. “Book 11.46333 12.60031 13. 4606 
10.74055 11.08 ad 12.64750 13.75886 15.00395 15. ay: 3963 
11.65370 11.99 -12.64750 13.52530 14.62071 15.85111 16.72001 18.66580 


For given $N$, $k$ ($1 \le k \le N$) and $\alpha$ the entries in the above table are the values
of $y$ for which

$\frac{N!}{(k-1)!\,(N-k)!} \int_0^y [G_r(z)]^{k-1} [1 - G_r(z)]^{N-k} g_r(z)\,dz = \alpha$,

where $g_r(z)$ and $G_r(z)$ refer
to the p.d.f. and c.d.f. of the standardized gamma chance variable, respectively.


Paste over the bottom of Table III E 


7.78018 8.21457 9.03053 10.10404 11.40867 12.83660 13.82748 15.99525 

8 80 9.32261 10.10403 11.13707 12.40017 13.79114 14.76083 16.89148 
10.38453 11.13707 12.13657 13.36516 14.72525 15.67679 17-7369 

11.40795 12.13657 13.10824 14.30798 15.64197 16.57810 18.65034 

: 12.71279 13.41591 14.35168 15.52608 16.83093 17. 74947 19.78918 

-71279 13.30944 13.67028 14.35768 15.28085 16.42973 17.71652 18.62317 20.64808 


For given $N$, $k$ ($1 \le k \le N$) and $\alpha$ the entries in the above table are the values
of $y$ for which

$\frac{N!}{(k-1)!\,(N-k)!} \int_0^y [G_r(z)]^{k-1} [1 - G_r(z)]^{N-k} g_r(z)\,dz = \alpha$,

where $g_r(z)$ and $G_r(z)$ refer
to the p.d.f. and c.d.f. of the standardized gamma chance variable, respectively.


Several issues of the August 1960 Technometrics were found to be mis-collated. 
That is, several pages were missing and others repeated. Subscribers are asked 
to inspect their August 1960 issue, and to return imperfect copies to 


Mr. Willis Shell 

William Byrd Press 

P.O. Box 2W 

Richmond 5, Virginia
for replacement.